At the 2003 National Center for Education Statistics (NCES) Summer Data Conference, scholars in the field 
of education finance addressed the theme, “Data Changing Our World.” Discussions and presentations dealt 
with topics such as the effects of salary and working conditions on teacher turnover, determining the cost of 
improving student performance, and measuring school efficiency. 

Developments in School Finance: 2003 contains papers presented at the 2003 annual NCES Summer Data 
Conference. The presenters are experts in their respective fields, each of whom offers a unique perspective or 
has conducted quantitative or qualitative research regarding emerging issues in education finance. It is my 
understanding that the reaction of those who attended the Conference was overwhelmingly positive. We hope 
that will be your reaction as well. 

This volume is the eighth education finance publication produced from papers presented at the NCES Sum- 
mer Data Conferences. The papers included in this volume present the views of the authors, and are intended 
to promote the exchange of ideas among researchers and policymakers. No official support by the U.S. Depart- 
ment of Education or NCES is intended or should be inferred. Nevertheless, NCES would be pleased if the 
papers provoke discussions, replications, replies, and refutations in future Summer Data Conferences. 

Introduction 



The papers included in this volume of fiscal proceed- 
ings were presented by education finance experts 
at the July 2003 NCES Summer Data Conference. 
The presenters were invited by the editor to contrib- 
ute their papers to this volume because, in his opin- 
ion, their work in elementary-secondary public school 
education finance is among the leading work in the 
field. The following paragraphs present an overview 
of each of the papers in this volume, in the order in 
which they appear. For each paper, the title (in bold) 
and list of authors and their affiliations introduce the 
paper summary. 

The Revolving Door: Factors Affecting Teacher Turn- 
over. In this paper, Eric A. Hanushek of the Hoover 
Institution at Stanford University, the late John F. 
Kain of the University of Texas at Dallas, and Steven 
G. Rivkin of Amherst College use Texas teacher data 
to conclude that Texas public school teachers’ work- 
ing conditions matter more to them than salary. The 
authors state that although experienced teachers, on 
average, are more effective at raising student perfor- 
mance, they typically either leave teaching or flee from 
urban to suburban schools. What has not been well 
understood, the authors assert, is whether experienced 
teachers leave schools with high concentrations of dis- 
advantaged and low-achieving students for reasons 
of compensation, or because of their working condi- 
tions. 

Hanushek, Kain, and Rivkin state the reason this is- 
sue has not been resolved has been the difficulty of 
separating the effects of teachers’ salary from the 
effects of their working conditions and preferences. 
This requires databases with detailed information on 
enough teachers and students to statistically distin- 
guish what influences teachers’ decisions. Using data 
from the state of Texas for elementary schools from 
1993 to 1996, the researchers were able to construct 
such a database. 

The authors report that teachers leave teaching or trans- 
fer from one school to another in response to the char- 
acteristics of their students more than to better salaries 
in other schools. They posit that this is why disadvan- 
taged, low-achieving students are in schools with rela- 
tively inexperienced teachers. Since salary does not seem 
to be the primary motivation for exiting, the authors 
suggest that improving the working conditions in these 
inner-city schools, as well as increasing the salaries of 
“quality” teachers, should be considered by 
policymakers. 

Financing Urban Schools: Emerging Challenges for 
Research, Policy, and Practice. In this paper, Christo- 
pher Roellke of Vassar College and Jennifer King Rice 
of the University of Maryland review contemporary 
research on the financing of urban school systems. The 
dilemma for the nation’s large urban schools, Roellke 
and Rice posit, is that they are particularly vulnerable 
to funding reductions as states and localities respond 
to lower revenues and deficits, at the same time that 
they are the focus of increased standards and account- 
ability. In part, the increased demands are because 
policymakers wish to close the achievement gap be- 
tween low-income, minority students and more for- 
tunate, better achieving students. The authors, in the 
first volume of a new book series on education fiscal 
policy and practice, asked a group of school finance 
experts to address the critical challenges in financing 
urban education, and they synthesize the key themes 
that arose. 

Roellke and Rice report that the solutions to the fund- 
ing problems of urban schools are not easily addressed 
by research. Currently, urban schools have many ser- 
vice delivery options available to them. Examples in- 
clude implementing class-size reduction, alternative 
scheduling, summer enrichment, early intervention 
programs, and a wide array of whole-school reform 
models. To date, education finance researchers have 
little to offer on the cost or the effectiveness of alterna- 
tive practices to assist urban policymakers in choosing 
between service delivery options. In addition, it is ap- 
parent that no one solution can improve student out- 
comes in all schools. One contributor to the volume, 
Jennifer Imazeki, states that additional compensation 
alone appears to be insufficient to attract and retain 
high-quality teachers in urban schools. And the vast 
majority of public schools, according to contributors 
Schwartz, Amor, and Fruchter, receive private support, 
which for some urban schools represents over half of 
their children services funding. Thus, Roellke and Rice 
argue, the ability of some urban schools to implement 
certain reforms is dependent upon this episodic non- 
traditional funding. 

Roellke and Rice also report that urban school dis- 
tricts do not have the resources to close the achieve- 
ment gap between their low-income, minority students 
and more fortunate, better achieving students in other 
school districts. This inability has resulted in legal 
challenges in state courts, focused on allegations that 
the state education finance formula does not allow poor 
school districts to provide an adequate education. The 
authors state that some state courts, such as in New 
York, have required a costing-out study to set a dollar 
figure that would ensure an adequate education. 

Financing Education So That No Child Is Left Behind: 
Determining the Costs of Improving Student Perfor- 
mance. In this paper, Andrew Reschovsky of the Uni- 
versity of Wisconsin-Madison and Jennifer Imazeki of 
San Diego State University estimate the cost of achiev- 
ing a specified improvement in student performance. 
To do this, they use the characteristics of schools and 
students in Texas that may cause some schools to spend 
more than others to achieve a given student perfor- 
mance standard. The authors state that these costs are 
due primarily to factors over which local school offi- 
cials have little control, such as high concentrations of 
low-income students or ESL students who may require, 
for example, smaller classes or specialized programs. 
In addition, the authors state, some schools, because 
of their location and student composition, may have 
to offer higher compensation to attract and retain staff. 

Reschovsky and Imazeki suggest that substantial cost 
differences among school districts will render those 
with above-average costs unable to bring their students 
up to the new standards, unless these school districts 
receive additional aid. The authors report palpable cost 
differences in Texas. They then devise a cost-adjusted 
foundation formula for the state to send more funds 
to school districts with higher costs. 

Reschovsky and Imazeki caution that their estimated 
cost functions should not be interpreted to mean that if 
a school district with high costs is provided with suffi- 
cient additional funds it could meet state-imposed per- 
formance standards in a single year. It may take more 
time than anticipated for a school district to reach any 
specified state standard, particularly if the school dis- 
trict is substantially below the desired standard. In ad- 
dition, the authors state that a one-time increase in state 
aid would not be as effective as a gradual phase-in. 

Distinguishing Good Schools From Bad in Principle and 
Practice: A Comparison of Four Methods. In this pa- 
per, Ross Rubenstein of the Maxwell School of Citi- 
zenship and Public Affairs at Syracuse University and 
Leanna Stiefel, Amy Ellen Schwartz, and Hella Bel 
Hadj Amor of the Robert F. Wagner Graduate School 
of Public Service at New York University compare four 
quantitative techniques to measure school performance. 
The distinctive aspect of this paper is that the authors 
estimate efficiency scores, particularly the identifica- 
tion of “good” and “bad” schools, using each of four 
methods based on the same data. They use New York 
City and Ohio school data that includes student char- 
acteristics, test scores, and school resources. They then 
explore how and why the methods and results differ. 
The four efficiency techniques explored in the paper 
are adjusted performance measures (APMs); data en- 
velopment analysis (DEA); education production func- 
tions (EPFs); and cost functions. 

The authors assert that one of the most difficult chal- 
lenges is comparing schools educating diverse students, 
particularly those without the necessary resources. 
Variations in student performance are highly corre- 
lated with students’ socioeconomic backgrounds. Ur- 
ban schools serving primarily minority and low-income 
students may appear to be low performing principally 
as a result of factors outside their control. What the 
authors wish to do is to find schools that make the 
most effective use of their limited resources, and ex- 
plore how they make use of those resources. 

Even using the same data and specifications, the au- 
thors demonstrate that different analytic methods may 
also produce different results. The different analytic 
methods, however, do produce lists of “good” and “bad” 
schools that are similar. And the use of more than one 
analytic method improves the accuracy of the identifi- 
cation of schools as either “good” or “bad.” The au- 
thors state that 

. . . simplistic measures of school performance, 
which do not account for the complex envi- 
ronment of schooling, risk identifying the 
wrong schools as either exemplars or in need 
of interventions. This problem is particularly 
critical when the performance measures are 
used to distribute rewards and sanctions. 

Court-Mandated Change: An Evaluation of the Effi- 
cacy of State Adequacy and Equity Indicators. In this 
paper, Jennifer Park and Ronald A. Skinner of Educa- 
tion Week explore the validity of their equity and ad- 
equacy grades for states’ school finance systems. They 
examine four states — New Hampshire, New Jersey, 
Vermont, and Wyoming — that have recently changed 
their school finance system as a result of losing a court 
challenge to the state education funding system. The 
authors compare the main equity and adequacy indi- 
cators before and after court-mandated changes were 
implemented. What the authors sought to do was to 
verify that these indicators were accurately portraying 
the changes that were occurring in these state educa- 
tion finance systems. 

In order to test the validity of the equity and adequacy 
measures in the four states, information regarding the 
school finance and litigation history of these four states 
was collected by the authors from court decisions, leg- 
islative changes to the state education funding system, 
and published studies. 

Park and Skinner report that the indicators for two of the 
four states they examined, New Hampshire and New 
Jersey, matched very well with actual change. Vermont 
and Wyoming’s changes were less clear, perhaps, they 
state, because the reforms in these states were imple- 
mented over several years and this analysis looked for 
changes in indicators in the single year where the most 
changes occurred. Some indicators were more accurate 
than others, and the authors suggest using a weighting 
system that more heavily weights the more accurate in- 
dicators. Because of the time lag in the availability of 
federal school finance data, the authors emphasize that 
current contextual information is important to consider 
when interpreting these equity and adequacy indicators. 

School Finance Reform and School Quality: Lessons From 
Vermont. In this paper, Thomas Downes of Tufts Uni- 
versity examines the changes in Vermont’s distributions 
of education spending resulting from the 1997 enact- 
ment of Act 60. Specifically, Downes examines whether 
the resulting changes in the distributions of spending 
have generated greater equality in measured student per- 
formance. Act 60, the “Equal Educational Opportu- 
nity Act,” may be the most radical reform of a state’s 
system of public school financing since the post-Serrano 
and post-Proposition 13 changes in California in the 
late 1970s, according to Downes. Prior to Act 60, Ver- 
mont used a traditional foundation formula to give towns 
state education aid. Act 60 created a combined founda- 
tion and power equalization plan that included a state- 
wide property tax. The changes, which Downes states 
were designed to shift some of the burden away from 
state residents to corporations and nonresident owners, 
were phased in over several years. 

Examining local education spending inequality before 
and after enactment of Act 60, Downes finds that in- 
equality generally declined. Examining the relation- 
ship of spending and town wealth, he finds that 
wealthier towns did spend more prior to Act 60. How- 
ever, Downes asserts that care must be taken not to 
make too much of the declines in inequality, as they 
are small and not consistent. The small declines in the 
disparities in student performance would not, he states, 
justify a major policy change like Act 60. 

Downes concludes that Act 60 was a dramatic change 
in Vermont’s education funding, and that his analyses 
demonstrate a reduced range in education spending 
resulting from weakening the link between spending 
and property wealth. In addition, Downes tentatively 
concludes there is some evidence that student perfor- 
mance has become more equal since enactment of Act 
60, but the improvements have been small. 

Shopping for Evidence Against School Accountability. 
In this paper, Margaret E. Raymond and Eric A. 
Hanushek of the Hoover Institution at Stanford Uni- 
versity explore whether or not accountability is associ- 
ated with more gains in learning by students. The 
authority behind accountability has spread from states 
to also include the No Child Left Behind Act of 2001 
(NCLB), which mandates reporting and accountabil- 
ity through testing. Opponents, Raymond and 
Hanushek assert, aggressively search for evidence that 
testing and accountability are actually harmful to stu- 
dents. The existing evidence on state accountability 
systems, Raymond and Hanushek report, indicates 
that their use leads to better student achievement. 

Raymond and Hanushek use “The Nation’s Report 
Card” (National Assessment of Educational Progress) 
scores to demonstrate that states with their own “report 
cards,” which serve as a public disclosure, without sanc- 
tions and rewards, have gains of up to 1.2 percent. Those 
states that disclose publicly and also have sanctions and 
rewards (which the authors call “accountability” systems) 
have a 1.6 percent increase in mathematics performance. 
However, the introduction of an accountability system 
also has unintended consequences, they state, such as 
cheating. These findings differ, however, from those of 
previous studies by other researchers. 

The conclusion reached by Raymond and Hanushek 
is that the media is unaware of or indifferent to qual- 
ity differences in the competing evidence on account- 
ability program performance. Since the media, many 
policymakers, and decisionmakers in education agen- 
cies serve a “gatekeeper” function for disseminating in- 
formation to the general public, evidence quality is of 
great importance, Raymond and Hanushek assert. 
When millions of dollars are involved, as they are in 
accountability systems, evidence must meet the high- 
est scientific standards, the authors declare, reliably 
controlling for rival alternative explanatory factors. 

Raymond and Hanushek conclude that no one yet 
understands how best to design accountability systems 
that can be directly linked to incentive systems. 



The papers in this publication were requested by the National Center for Education Statistics, U.S. Depart- 
ment of Education. They are intended to promote the exchange of ideas among researchers and policymakers. 
The views are those of the authors, and no official endorsement or support by the U.S. Department of 
Education is intended or should be inferred. This publication is in the public domain. Authorization to 
reproduce it in whole or in part is granted. While permission to reprint this publication is not necessary, please 
credit the National Center for Education Statistics and the corresponding authors. 

This essay is adapted from articles in the winter 2004 
issue of Education Next¹ and the spring 2004 issue of 
the Journal of Human Resources.² 

Introduction 

Experienced teachers are, on average, more effective 
at raising student performance than those in their early 
years of teaching.³ This gives rise to the concern that 
too many teachers leave the profession after less than 
a full career and that too many leave troubled inner- 
city schools for suburban ones. Until now, the roots 
of these problems have not been well understood. In 
particular, it is not known whether teachers leave 
schools with high concentrations of disadvantaged and 
low-achieving student populations for financial rea- 
sons or because of the working conditions associated 
with serving these students. Nor are there good esti- 
mates of what kinds of salary increases would need to 
be offered to slow the turnover among teachers. 

Significant policy decisions rest on understanding the 
market for teachers and the responsiveness of teachers 
to varying conditions of employment. This article sum- 
marizes our recent analysis of teacher decisions. The un- 
derlying technical article (Hanushek, Kain, and Rivkin 
[2004b]) provides detailed statistical estimates of 
teacher behavior. Here we distill the main findings with 
an eye toward the policy implications of that work. 

The chief obstacle to resolving these issues has been 
the difficulty of separating the effects of teachers’ salary 
levels from their working conditions and preferences. 
The outstanding suburban school that retains most of 
its teachers is likely to be attractive on a number of 
levels: the pay is good, students are high performing, 



¹ Hanushek, Kain, and Rivkin (2004a). 

² Hanushek, Kain, and Rivkin (2004b). This article is the technical version of this essay. 

³ Our work on teacher quality differences finds that teachers in their initial years of experience perform significantly worse than they do later in 
their careers. This effect appears to be primarily a “learning effect” where teachers improve through on-the-job experience, as opposed to 
a compositional effect arising from the fact that many, possibly less skilled, early career teachers tend to exit teaching altogether. See 
Rivkin, Hanushek, and Kain (2001) and Hanushek and Rivkin (2004). 

and parents are supportive. Since all three factors help 
in attracting and retaining teachers, it becomes diffi- 
cult to calculate the degree to which each factor sepa- 
rately affects a teacher’s decision to return to that school 
the following year. Conversely, the school that has dis- 
advantaged and low-performing students may suffer 
high rates of teacher turnover, but sorting out the causes 
of turnover is difficult. Doing so requires detailed in- 
formation for enough teachers and students to allow 
analysts to distinguish statistically among the various 
factors that affect teachers’ decisions. 

Fortunately, important parts of the necessary informa- 
tion are now available for elementary schools in the 
state of Texas for the years 1993 through 1996. Work- 
ing in cooperation with the Texas Education Agency, 
the University of Texas at Dallas’s 
Texas Schools Project has combined 
various data sets to create a database 
of key characteristics of both teach- 
ers and students during this period 
in all Texas public schools. This in- 
formation includes the race, ethnicity, 
and gender of both students and 
teachers; students’ eligibility for a 
subsidized lunch; and students’ per- 
formance on the Texas Assessment of 
Academic Skills (TAAS), a criterion- 
referenced test administered each 
spring to students in grades 3 through 
8. The database also contains annual 
information about the teachers: their 
years of experience, their education and salary levels, 
the grades and subjects they teach, and the size of their 
classes. 

Our analysis of these data reveals that teachers trans- 
fer from one school to another — or exit the Texas pub- 
lic school system altogether — more as a reaction to the 
characteristics of their students than as a response to 
better salaries in other schools. This tends to leave dis- 
advantaged, low-achieving students with relatively 
inexperienced teachers. Because teachers appear so un- 
responsive to salary levels, it would take enormous 
across-the-board increases to stem these flows. Indeed, 
the results suggest that policymakers ought to con- 
sider only selective pay increases, preferably keyed to 
quality, for work in inner-city schools, together with 
efforts to improve the working conditions in these 
schools. 



Reasons for Leaving 

Teachers decide whether to remain at a school for a mul- 
tiplicity of reasons, which can be divided into four main 
categories: (1) characteristics of the job, including salary 
and working conditions; (2) alternative job opportuni- 
ties; (3) teachers’ own job and family preferences; and 
(4) school districts’ personnel policies. Although we were 
not able to look at the ways in which all of these factors 
affect teachers’ decisions with respect to their employ- 
ment situation, we were able to examine directly the 
impact of salary and certain working conditions. We were 
also able to draw some reasonable inferences about how 
family considerations and alternative job opportunities 
influence teachers’ decisions by examining how their 
choices differ by gender and experience. 

Admittedly, “working conditions” is 
a broad concept that can cover every- 
thing from class size to discipline 
problems to student achievement lev- 
els. Though we do not have observa- 
tional data on every aspect of teachers’ 
working conditions, we do know cer- 
tain characteristics of their students 
that many believe affect the teaching 
conditions at a school: the percent- 
age of low-income students at the 
school (as estimated by the percent- 
age eligible for a subsidized lunch), 
the shares of students who are Afri- 
can American or Hispanic, average 
student test scores, and class sizes. 
Whether these characteristics directly affect teachers’ 
decisionmaking or indicate other less tangible factors 
(such as the disciplinary climate or bureaucratic envi- 
ronment at the school) cannot be determined. 

When looking at the impact of working conditions on 
retention rates, one needs to take into account other 
factors that may affect teachers’ employment choices. 
Some teachers possess skills that are considered more 
valuable in the private sector employment marketplace. 
For instance, mathematics and science teachers may 
find more demand for their services in the private sec- 
tor than an English teacher would. However, our study 
focuses on elementary school teachers, who tend to 
have similar educational backgrounds and similar op- 
portunities outside the education system. As a result, 
differences in private sector alternative employment 
opportunities among teachers of different subjects 
should not be very important for this analysis. 

A more important consideration is that many teachers 
may wish to remain at a particular location for other 
than job-related reasons, perhaps out of a desire to 
live near their hometown or near their spouse’s work- 
place. Consequently, the availability of jobs in the lo- 
cality may be an important determinant of the 
probability of exiting a school, and we control for any 
systematic differences across regions within Texas. 

Retention rates can also be affected by the number of 
years a teacher has spent in a particular location. The 
more years working in a particular district, the more 
costly it becomes to leave, simply because pay, respon- 
sibilities, and job opportunities are 
often tied directly to experience 
within the same school district. The 
financial attractiveness of moving else- 
where also attenuates with the pas- 
sage of time. Because many districts 
credit a transferring teacher with only 
a limited number of years of experi- 
ence, teachers may have to take a sal- 
ary cut if they switch school districts. 

In general, switching careers grows 
costlier with age and experience. One 
must give up the higher salary that 
comes with experience within a par- 
ticular field, and the time to accu- 
mulate gains from any change in job 
or career grows shorter as one ages. For this reason, 
our analysis takes into account the number of years 
teachers have held their jobs by comparing only teachers 
with similar levels of experience. 

Other relevant differences among teachers may arise 
from their family circumstances, such as the job op- 
portunities of a spouse or a desire to stay home with 
young children or to enjoy the benefits of home own- 
ership. For example, many female teachers who leave 
teaching do so in order to leave the labor market alto- 
gether, often for family reasons. We unfortunately lack 
information on family structure, sources of income 
other than salary, the location or type of housing, and 
whether and where a spouse works. However, we are 
able to look separately at teachers grouped by gender, 
giving us an opportunity to assess the extent to which 
female and male teachers are influenced by different 
school-related factors. 

Ethnicity may also affect decisionmaking. Teachers may 
prefer to teach in schools where they share the ethnic 
characteristics of the students, or they may find it easier 
to obtain a position if administrators prefer instruc- 
tors who have certain ethnic characteristics. To ascer- 
tain whether ethnic background affects teachers’ 
decisionmaking, we also look separately at White, Af- 
rican American, and Hispanic teachers. 

One limitation of our study is that we do not have 
direct information on school districts’ hiring and re- 
tention practices. Districts have options when hiring, 
and the willingness of a teacher to 
leave a position will depend on the 
availability of an attractive position 
elsewhere. Although few teachers are 
involuntarily separated from their 
jobs, we do not know whether a job 
change is determined primarily by a 
teacher’s decision or by that of the 
employer, and the circumstances un- 
doubtedly affect both opportunities 
and the range of choices a teacher will 
consider. Our lack of information 
about employer-initiated moves may 
lead to an underestimate of the im- 
provements in pay and working con- 
ditions achieved by teachers who 
move voluntarily, but the size of this underestimate is 
probably not very large. 

Movement Between and Within 
Districts 

Each year, approximately one-fifth of all teachers na- 
tionwide decide to leave the school at which they are 
teaching. The pattern in Texas is roughly the same as 
in the nation as a whole. On average, in each year be- 
tween 1993 and 1996, more than 18 percent of Texas 
teachers decided not to remain at the school at which 
they were teaching. More than 6 percent changed 
schools within their districts, another 5 percent 
switched from one district to another, and 7 percent 
left Texas public schools altogether. 



Let’s look first at the changes in salary typically expe- 
rienced by teachers moving to a new district. Instead 
of relying on salary data reported for each individual 
teacher, we calculate district average salaries for teach- 
ers in each of their first 10 years of experience during 
the period from 1993 to 1996. These averages are based 
on regular pay for teachers without advanced degrees 
and exclude extra pay for coaching or other activities. 
(The latter is not an important part of compensation: 
over 85 percent of teachers receive no extra pay, and 
the median extra pay for those who do receive it is 
about $1,000 per year.) We use these averages to char- 
acterize the salary schedule of each district and then 
estimate the potential salary change resulting from a 
move, given the experience level of each teacher. For 
example, the salary change for a teacher who switches 
districts after 4 years of teaching is 
assumed to equal the average salary 
of fifth-year teachers in the new dis- 
trict minus the salary for that level 
of experience in the old district. 
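
The salary comparison just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function names build_salary_schedule and potential_salary_change, and the dollar figures are hypothetical and are not taken from the Texas data.

```python
from collections import defaultdict
from statistics import mean


def build_salary_schedule(records):
    """Average base salary for each (district, year of experience) cell.

    `records` is an iterable of (district, experience_year, base_salary)
    tuples. This layout is an assumption made for illustration only.
    """
    grouped = defaultdict(list)
    for district, experience_year, base_salary in records:
        grouped[(district, experience_year)].append(base_salary)
    return {cell: mean(salaries) for cell, salaries in grouped.items()}


def potential_salary_change(schedule, old_district, new_district, years_completed):
    """Estimated salary change from switching districts.

    A teacher who has completed `years_completed` years is compared at the
    next year of experience in both districts, using district averages
    rather than the teacher's own reported salary.
    """
    next_year = years_completed + 1
    return schedule[(new_district, next_year)] - schedule[(old_district, next_year)]


# Hypothetical fifth-year salaries in two made-up districts, "A" and "B".
records = [
    ("A", 5, 26000.0), ("A", 5, 26400.0),
    ("B", 5, 26500.0), ("B", 5, 26300.0),
]
schedule = build_salary_schedule(records)

# A teacher leaving district A for district B after 4 years of teaching.
change = potential_salary_change(schedule, "A", "B", 4)
percent_change = 100.0 * change / schedule[("A", 5)]
print(change, round(percent_change, 2))
```

Comparing both districts at the same experience level keeps the estimate from being distorted by the raise a teacher would have received anyway for an additional year of service.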

On average, teachers who move be- 
tween districts after no more than 
2 years at a school improve their sala- 
ries, though just barely. Male teach- 
ers gain 1.2 percent in salary, while 
women gain 0.7 percent. Even these 
small gains begin to disappear for 
teachers with more experience. Over- 
all, the average annual salary gain 
among all teachers with less than 10 
years’ experience is 0.4 percent of annual salary, or 
roughly $100. Women with 3 to 9 years of experience 
who decide to change districts actually take, on aver- 
age, a small pay cut. In short, most teachers moving 
between districts do not receive substantially better pay 
in their new jobs. 

The picture for working conditions is quite different. 
There is strong evidence that teachers moving between 
districts have the opportunity to teach higher achiev- 
ing, higher income, nonminority students. The find- 
ings for achievement are the clearest and most 
consistent. The average job switcher moving from one 
district to another moved to a district whose average 
achievement was 0.07 standard deviations higher on 
the Texas Assessment of Academic Skills than that of 
the district the teacher left. (The difference is 3 per- 
centile points on a 100-point scale.) The shares of the 
district’s students who were African American, His- 
panic, or low income also declined significantly for 
movers. On average, the districts to which teachers 
moved had 2 percentage points fewer African Ameri- 
can students, 4.4 percentage points fewer Hispanic 
students, and more than 6 percentage points fewer 
low-income students (of any ethnicity) than the dis- 
tricts they had left. 

These patterns were even more pronounced for teach- 
ers who moved from urban to suburban districts. The 
salaries of such teachers actually declined by 0.7 per- 
cent, on average, as a result of their moves. Meanwhile, 
the average achievement in the new districts increased 
by 0.35 standard deviations (14 percentile points), and 
the shares of African American and Hispanic students 
decreased by 14 and 20 percentage 
points, respectively. Teachers who 
moved between different suburban dis- 
tricts experienced similar, albeit smaller, 
changes in student characteristics. Stu- 
dent achievement in their new districts 
was one-tenth of a standard deviation 
higher, while the percentages of Afri- 
can American, Hispanic, and economi- 
cally disadvantaged students all 
declined. 
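The percentile equivalents quoted above (0.07 standard deviations as about 3 points, 0.35 standard deviations as about 14 points) are consistent with a simple conversion. The sketch below assumes, for illustration only, that scores are normally distributed and that the comparison starts from the median; the text does not state how the authors made the conversion, and the function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_gain_from_median(sd_gain: float) -> float:
    """Percentile points gained by moving `sd_gain` standard deviations
    above the median of a normal distribution (an assumed starting point)."""
    return 100.0 * (normal_cdf(sd_gain) - 0.5)


for sd in (0.07, 0.35):
    print(sd, round(percentile_gain_from_median(sd)))
```

Under these assumptions the two gains round to 3 and 14 percentile points, matching the figures in the text.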

We can gain further insight into the 
factors associated with teacher mobil- 
ity by examining the pre- and post- 
move school characteristics for 
teachers moving to a new school within the same dis- 
trict. These results confirm that teachers who move 
between schools within urban districts typically ar- 
rive at a school with higher average student achieve- 
ment (0.11 standard deviations) and a smaller 
percentage of minority and low-income students. In 
other words, those who choose to change schools 
within districts appear to follow the same pattern,
seeking out schools with fewer academically and eco- 
nomically disadvantaged students. These patterns are 
also consistent with the notion that new teachers are 
often placed in the most difficult teaching situations 
and that senior teachers can often choose more com- 
fortable positions within the system. 

Important differences emerge, however, when we sepa- 
rate teachers by their own ethnic background. Afri- 
can American teachers tend to move to schools with 
higher percentages of African American enrollment
than their previous schools, regardless of whether they 
change districts or simply move to a new school in 
the same district. However, the average change in the 
percentage of Hispanic students for teachers of His- 
panic descent is not much different from the changes 
experienced by teachers as a whole. The typical gap 
in average test scores between their current and former 
school is also much smaller for African American and 
Hispanic teachers who have switched schools. 

It is not clear whether these ethnic differences are the 
result of teachers’ preferences or of the job opportuni- 
ties available to them. It could be that African Ameri- 
can teachers prefer to work at a school near where they 
live. If so, then residential segregation by race may 
lead to their selection of schools with more African 
American students. Or teachers may simply prefer to 
teach students of a similar ethnic background. Alter- 
natively, job opportunities for African American teach- 
ers may be more extensive in schools with higher 
proportions of African American students. 

All this movement of teachers among schools obvi- 
ously affects the composition of the teaching force at 
particular schools. Since exit rates are lower at
schools with more advantaged students, these schools 
also enjoy more experienced teachers. The pattern is
particularly striking when schools are grouped according
to their average level of student achievement.
As figure 1 shows, almost 20 percent of teachers in 
schools in the bottom quartile of student achieve- 
ment leave their schools each year, while only 15 
percent of teachers in schools in the top quartile leave 
their schools each year. The driving force of this rela- 
tionship is not simply teachers’ leaving urban dis- 
tricts for suburban ones; more of the difference in 
leaving rates between these types of schools is caused 
by teachers moving to new schools within their original
district, and there are nontrivial differences in the
rates of leaving teaching entirely. Since teachers with 
fewer than 2 years of experience tend to be less effec- 
tive than more experienced teachers, existing mobil- 
ity patterns in Texas are likely to adversely affect the 
achievement of disadvantaged students. 

Salaries and Student Demographics 

The analysis to this point has not disentangled the 
effects of salaries from the effects of the working con- 
ditions associated with students of varying achieve- 
ment and family backgrounds. To identify more 
precisely the independent effects of the multiple fac- 
tors affecting teachers’ choices, we use regression 
analysis to estimate the separate effects of salary differences
and school characteristics on the probability
that a teacher will leave a school district in a given
year, holding constant a variety of other factors, including
class size and the type of community (urban,
suburban, or rural) in which the district is
located. We also compare the impact of salaries and
school characteristics on the probability of switching
to another district with their impact on the probability
of leaving teaching altogether.

Figure 1. Yearly percentage loss of teachers, by school's performance on the Texas Assessment
of Academic Skills (TAAS) and location of teacher's new position

[Figure not reproduced. Bars compare schools in the bottom quartile of TAAS performance with schools in the top quartile of TAAS performance.]

SOURCE: Authors.

The results of this analysis confirm that teachers are 
more likely to leave districts with low average achieve- 
ment scores. Ethnic composition of the student body 
is also an important determinant both of the prob- 
ability of leaving the public schools entirely and of 
switching from one school district to another. White 
teachers, regardless of their teaching 
experience, will tend to move to 
schools with fewer African American 
and Hispanic students. Less experi- 
enced White teachers are also more 
likely to leave the public schools al- 
together if they come from schools 
with higher concentrations of Afri- 
can American and Hispanic stu- 
dents. African American and 
Hispanic teachers, however, do not 
show this aversion to concentrations
of minority students. 

The differential effect of the ethnic 
composition of the student body for 
White and African American teachers could reflect 
personnel policies that prefer minority teachers in 
schools with higher concentrations of minority stu- 
dents. But teachers’ own preferences may be even more 
important, as suggested by the fact that the decision 
to leave the Texas public schools altogether — a deci- 
sion much more closely related to the individual 
teacher’s preferences than to the district — is influenced 
in the same way by the schools’ ethnic composition. 

Although the ethnic composition of the school is the 
most important factor affecting teachers’ decisions to 
change jobs, financial considerations are also relevant, 
especially when it comes to a decision by a male 
teacher to move from one district to another. For male 
teachers with fewer than 3 years of experience, the 
estimated change in the probability of switching districts
for a 10 percent increase in salary is 2.6 percentage
points; for men with 3 to 5 years of experience,
the estimated change for a salary increase of the same
magnitude is 3.4 percentage points; for still more 
experienced male teachers, financial effects trail off, 
down to essentially zero for those with more than 20 
years of experience. 

The corresponding numbers for less experienced 
women teachers are less than half those for men. More- 
over, salary differences have no observable effects on 
women with 6 or more years of experience. The unre- 
sponsiveness of female teachers to salary increases is 
important in the subsequent policy discussion, since 
female teachers represent the vast majority of elemen- 
tary school teachers. 

Policy Implications 

The results presented above confirm 
the difficulty that schools serving 
academically disadvantaged students 
have in retaining teachers, particu- 
larly those early in their careers. 
Teaching lower achieving students is 
a strong factor in decisions to leave 
Texas public schools, and the mag- 
nitude of the effect holds across the 
full range of teachers’ experience lev- 
els. There is also strong evidence that 
a higher rate of minority enrollment 
increases the probability that White 
teachers will leave a school. By con- 
trast, increases in the shares of African American and 
Hispanic students reduce the probability that Afri- 
can American and Hispanic teachers will leave. 

Given these findings, a key question is how to reduce 
the flows out of low-achieving, high-minority schools 
and out of the teaching profession altogether. One oft- 
proposed solution is to provide teachers with “combat 
pay” — salary increments designed to encourage them 
to remain at a tough school. But how large would the 
increase need to be in order to neutralize the effects of 
difficult working conditions? Let’s consider this closely. 

The situation is complicated by the fact that most 
elementary school teachers in Texas are White females 
(only 20 percent are African American or Hispanic, 
while only 14 percent are male). As noted earlier, 
female teachers are less responsive to increases in sal- 
ary, meaning that the bonus required to keep them 
at a school will be larger than for males. In addition,
White teachers are the most likely to exit low-achieving,
high-minority schools, meaning that it will take
even larger increases to retain them. If the teaching 
corps looked much different — say, if the teachers in 
urban elementary schools were mostly African Ameri- 
can and Hispanic males — the costs of the “combat 
pay” solution would be lower. 

Based on our findings of what causes teachers to leave 
their schools, we calculated the salary increases that 
would be necessary to offset the effects of difficult 
working conditions in large urban versus suburban 
schools.* These calculations, performed separately for
White male and female teachers in their early careers, 
are shown in figure 2. The findings suggest that truly 
large boosts in salary would be needed, particularly 
among women. Female teachers in large urban school 
districts would require a 25 percent initial increase in 
compensation, rising to over 40 percent when they 
reach 3 to 5 years of experience. Moreover, this is only
in the “typical” urban school. For the neediest or most
troubled schools in urban areas, even the differentials 
calculated in figure 2 would probably not be suffi- 
cient to stem the high levels of turnover in such schools. 
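The logic of this calculation can be sketched as follows. The sketch assumes a linear relationship between salary and exit probability, which is a simplification introduced here for illustration; the function name and all input values are hypothetical and are not the study's estimates.

```python
def salary_premium_pct(extra_exit_pp: float, effect_pp_per_10pct: float) -> float:
    """Percent salary increase needed to offset a higher exit probability.

    extra_exit_pp: additional probability of leaving (percentage points)
        attributed to the student characteristics of the school.
    effect_pp_per_10pct: reduction in the probability of leaving
        (percentage points) produced by a 10 percent salary increase.
    Assumes the salary effect is linear, which is a simplification.
    """
    if effect_pp_per_10pct <= 0:
        raise ValueError("salary must reduce exits for a premium to exist")
    return 10.0 * extra_exit_pp / effect_pp_per_10pct


# Invented inputs: the same 3-point gap in exit probability, offset for a
# group that responds to salary and for a group that responds half as much.
extra_exit_pp = 3.0
for label, effect in (("more responsive group", 2.0),
                      ("less responsive group", 1.0)):
    print(label, salary_premium_pct(extra_exit_pp, effect))
```

With these invented inputs, halving the responsiveness to salary doubles the required premium, which is why the estimated increases are so much larger for the teachers who respond least to pay.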

Across-the-board salary increases of 25 to 40 percent 
for teachers in urban areas would be an enormously 
expensive reform, and, in addition, it would be diffi- 
cult to target such a solution, since teachers typically 
negotiate salary schedules that apply to all the teach- 
ers in the district, not just to those in the most disad- 
vantaged schools. Similarly, even if targeted to the most 
disadvantaged schools, any increases in salaries would 
almost certainly go to new and middle-career teachers 
alike, even though we know that salary differentials 
are nearly irrelevant for women teachers with 10 or 
more years of experience. 

At this time, we do not fully understand the working 
conditions that are most important, but we might 
speculate that at least a component involves school
characteristics that are simply associated with the student
characteristics we have identified. To the extent
that other characteristics of schools where disadvantaged
students are found (such as safety and disciplinary
problems, more bureaucratic rules, poor
leadership, greater student turnover, or a greater commuting
distance) are important elements, improving
these working conditions could mitigate the
turnover problem we have identified. And these improvements
might even have a direct effect on student
performance.

* These calculations, described in Hanushek, Kain, and Rivkin (2004b), rely on the estimated effects of salary and student characteristics
on the probability of leaving a school. From the differences in student characteristics for the average urban and average suburban school,
we calculate the increased probability that a teacher (of a given gender and experience category) will leave urban schools. Then, based on
the impact of salaries on exit rates, we calculate the salary premium needed to neutralize the effect of these adverse student characteristics.

Figure 2. Increase in salary of nonminority urban teachers necessary to equalize turnover
between urban and suburban schools, by experience and gender of teacher

[Figure not reproduced. Vertical axis: percent increase in annual salary. Horizontal axis: teaching experience. Series: males and females.]

NOTE: Estimates based on the differences in average achievement and in the shares of African American and Hispanic students
between large urban and suburban districts.

SOURCE: Authors.

Finally, it is important to note that this study focuses 
solely on how many teachers transition among schools 
and out of teaching. We have not examined the quality
of the teachers who move from one district to another
or leave teaching altogether. The actual cost of
improving the quality of instruction depends crucially
on whether good teachers, not just experienced teach- 
ers, are being retained. Salary policies that are guided 
just by the characteristics of the students in a school 
will retain both the good and the bad teachers. 

We do know from our other work that differences in 
teacher quality are more significant than the differ- 
ences arising from having inexperienced teachers.^ 
Therefore, an approach with more appeal might be 
simply to accept the fact that there may be greater 
turnover in schools serving a larger disadvantaged popu- 
lation, but then to concentrate much more attention 
and resources on the quality dimension. While we do 
not have much experience with such policies, they seem 
like the most feasible way to deal with the problems of 
schools serving low-income and minority students. 



^ Rivkin, Hanushek, and Kain (2001) establish lower bounds on the variation in teacher quality that is found in elementary schools in 
Texas. That analysis suggests that perhaps 10 percent of the variation in quality arises from experience effects that will change as teachers 
pass their initial period of teaching. 
If school finance has seasons, then these are bitter
cold days of winter for districts everywhere, even
those in the sunniest of climates.¹

The uncertainty surrounding the current economic 
climate has an immediate and direct impact on edu- 
cation fiscal policy and practice. The nation’s large 
urban education systems, serving high numbers of low- 
income, minority children, seem particularly vulner- 
able to programmatic cuts as states and localities 
respond to an environment of reduced revenues and 
huge budget deficits. Paradoxically, it is precisely these 
types of school systems that are the primary subjects 
of increased standards and accountability as 
policymakers and school officials seek to close the 
achievement gap between low-income, minority stu- 
dents and wealthier, predominantly White students. 

Many urban districts across the country have invested 
heavily to meet new federal student achievement goals 
contained in the No Child Left Behind Act. These 
same districts now face the challenge of meeting these 
mandates with fewer resources. Despite previous efforts
by school officials to avoid classroom-related cuts,
it appears likely that the current fiscal crunch will have
a direct impact on instructional services and supplies. 
City school systems, for example, are proposing to 
balance their budgets through increased class sizes, em- 
ployee furloughs, shortened school years, and freezes 
on textbook and furniture purchases (Gewertz and Reid 
2003). These sorts of strategies to accommodate 
shrinking budgets have the potential to severely re- 
duce the likelihood that these school systems will make 
progress toward meeting the increasingly high perfor- 
mance standards being set by states. 

The current education policy emphasis on higher per- 
formance standards, school-level accountability, and 
market-oriented reform presents important research 
challenges within the field of school finance and the 
economics of education. The simultaneous pursuit of 
both equity and efficiency within this policy context 
creates an unprecedented demand for rigorous, timely, 
and field-relevant research on fiscal practices in schools. 

In an effort to help meet this demand, we have developed
a new book series, Research in Education Fiscal
Policy and Practice. For our inaugural volume, Fiscal
Policy in Urban Education (Roellke and Rice 2002),
we assembled a group of school finance experts to address
the critical challenges in urban education fiscal
policy and practice. These researchers contributed a
diverse set of policy papers, including analyses of fiscal
accountability in urban schools, teacher recruitment
and quality, private influences on urban school
funding, and other pressing concerns in urban education
reform.² While many of the issues addressed by
the analyses are applicable to all schools (urban, suburban,
and rural), we intentionally elected to focus
on questions of both policy and practice that impact
schools located in the nation’s inner cities. Our intent
here is to synthesize the key themes that surfaced from
this research effort and to highlight compelling issues
in need of further examination.³ While
by no means exhaustive, the sections
below outline some of the most current
and critical issues facing urban fiscal
policy.

¹ Gewertz, C., and Reid, K.S. (2003, February 5). Hard Choices: City Districts Making Cuts. Education Week, 22(21), 1.



The Complexity of Urban School Reform






The chapters in our volume make it 
very clear that education reform in ur- 
ban contexts is extremely complex, 
even in the best of times. For decades, 
urban schools have been the object of 
one intervention after another, exem- 
plary of Cuban’s phrase, “reforming 
again, again, and again” (1990). Today’s policy cli- 
mate is characterized by widespread intolerance to- 
ward schools with a history of failure. Low-performing 
schools, often located in urban areas that serve large 
numbers of poor and minority children, report chroni- 
cally low attendance, achievement, and graduation 
rates. Many of these schools have been the focus of 
persistent reform efforts, yet satisfactory levels of stu- 
dent performance have remained elusive. Moreover, 
even when progress is evident, it is often not at a pace
to keep up with state demands to achieve higher standards
as measured by state-mandated tests (Corbett
and Wilson 1991; Ladd 1996). 

The educational policy terrain continues to be flooded 
with options like class-size reduction, alternative 
scheduling, summer enrichment, early intervention 
programs, as well as a wide array of whole-school re- 
form models. Successful implementation of these ini- 
tiatives is highly dependent on a variety of contextual 
factors, including student demographics, fiscal ca- 
pacity, school size, spending level, and district/school 
governance (Iatarola, Stiefel, and Schwartz 2002;
Brent, Roellke, and Monk 1997). This is certainly 
the case in all schools, but is especially problematic 
for urban educators, who confront 
a high concentration of at-risk stu- 
dents and a wide diversity of com- 
peting reforms that focus on at-risk 
youth. Further, researchers offer 
little definitive evidence on the cost 
or the effectiveness of alternative in- 
vestment options, information that 
could be very helpful to local 
policymakers facing a multitude of 
policy alternatives and a limited 
stock of resources.⁴



While many school reform strate- 
gies have been targeted to urban 
schools, a great deal of attention in 
education reform circles has focused on comprehen- 
sive school reform models as promising alternatives 
for improving the effectiveness of underperforming 
schools serving large concentrations of at-risk stu- 
dents. This approach to reform is attractive in that 
each model prescribes a “configuration of resources” 
that are intended to have a positive effect on the en- 
tire educational experience of students during their 
elementary school years (Rice 2001). Policies encouraging
schools to adopt comprehensive school reform
models are evident at the federal, state, and local levels,
and much of this attention has focused on urban
schools and school systems.

² For additional details of these research studies, see Roellke and Rice (2002).

³ This paper draws from the final summary chapter of the volume, Rice and Roellke (2002), and from Rice and Roellke (2003).

⁴ A series of literature reviews by Hanushek (1981, 1986, 1996, 1997) have shown a high level of inconsistent and insignificant findings across
studies estimating the impact of various types of educational investments. On the other hand, researchers who have reanalyzed Hanushek’s
data, challenging both his assumptions and his basic “vote counting” methodology, have reported more positive and consistent interpretations
of the same set of studies (Hedges, Laine, and Greenwald 1994; Laine, Greenwald, and Hedges 1996; Krueger 2002). On the cost side,
Levin and McEwan (2001) and Rice (1997, 2001) argue that cost analysis is an underutilized analytic tool in the field of education.

Research that examines both the implementation pro- 
cess and achievement effects of these comprehensive 
reform strategies has yielded mixed results.⁵ Methodological
challenges, including contextual differences
across model sites and the lack of randomized experi- 
ments, make it difficult to draw definitive conclusions. 
In addition, these comprehensive reforms must be 
studied longitudinally with a specific need for careful 
cost-effectiveness studies (Bifulco 2002). 

Two chapters in our volume on fiscal policy issues in 
urban education address key issues related to compre- 
hensive school reform. Bifulco reviews what the existing 
evidence shows to draw conclusions about the degree to 
which whole-school reform models can be expected to 
enhance the productivity of schools. His focus is on three 
models that have been implemented and studied: (1) 
The School Development Program, (2) Success for All, 
and (3) Accelerated Schools. In terms of effectiveness, 
he finds some evidence of positive effects of the Success 
for All model, but recognizes that the studies reporting
the strongest positive effects were conducted by program
developers. Bifulco concludes, “The School
Development Program does not appear to have positive 
impacts on average. However, it may have positive impacts
in some places under certain conditions” (p. 29).
Finally, the evidence on Accelerated Schools, Bifulco
concludes, is insufficient to draw general conclusions 
about the program’s effectiveness. 

In terms of the costs, Bifulco provides a summary table 
(table 1) of cost estimates generated by a handful of 
researchers who have studied the costs of these three 
models. While the range of costs for any given model 
is substantial, the Success for All program is generally 
the most resource-intensive of the three approaches. 
This is due to the highly prescriptive nature of this 
model in terms of new positions and training require- 
ments. Bifulco concludes that “the relatively large 
amount of resources demanded by Success for All raises 
questions about whether its positive impacts are the 
result of increased productivity at SFA schools or 
merely increased resources. The answer to this ques- 
tion hinges on whether the resources used to imple- 
ment SFA represent additional resources or a 
reallocation of existing resources” (p. 30). 

Bifulco concludes that “no model can improve student 
outcomes in all schools . . . More research is needed to 
identify the conditions under which particular whole- 
school reform models are most likely to succeed” (p. 
30). He recommends four directions for future research 
on whole-school reform models: evaluations of more 
models, additional cost studies, estimates of long-term 
impacts, and efforts to identify factors that promote 
the success of whole-school reform. 



⁵ For a review of the evidence of comprehensive school reform models on student achievement, see Herman (1999).



Table 1. Estimates of costs for three whole-school reform models

                            Success for All    School Development Program    Accelerated Schools
Study                       (in dollars)       (in dollars)                  (in dollars)

King (1994)¹                261,060-646,500    102,800-278,150               48,000-266,000
Barnett (1996)²             160,500-340,500    57,500-219,000                17,000-80,000
Herman (1999)³              270,000            45,000                        27,000
Borman and Hewes (2001)⁴    153,293-578,550    —                             —

— Not available.

¹ Estimates assume school with 500 students, and do not include costs for materials.

² Estimates assume school with 500 students, and do not include costs of parental time.

³ Estimates are based, in part, on King (1994) and Barnett (1996).

⁴ Estimates for actual schools of varying sizes. Estimates do not include extra time devoted by existing staff for training and
implementation activities.

SOURCE: Bifulco (2002).
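Because the King (1994) and Barnett (1996) estimates in table 1 assume a school of 500 students, the whole-school figures can be restated per pupil. The sketch below is a rough illustration only: the dollar ranges are copied from table 1, while the function name and the dictionary layout are hypothetical. The Herman and the Borman and Hewes rows are left out because the table notes do not tie them to a single enrollment figure.

```python
SCHOOL_SIZE = 500  # enrollment assumed in the King and Barnett estimates

# (low, high) whole-school cost ranges in dollars, copied from table 1
costs = {
    "King (1994)": {
        "Success for All": (261_060, 646_500),
        "School Development Program": (102_800, 278_150),
        "Accelerated Schools": (48_000, 266_000),
    },
    "Barnett (1996)": {
        "Success for All": (160_500, 340_500),
        "School Development Program": (57_500, 219_000),
        "Accelerated Schools": (17_000, 80_000),
    },
}


def per_pupil(cost_range, enrollment=SCHOOL_SIZE):
    """Convert a (low, high) whole-school cost range to per-pupil dollars."""
    low, high = cost_range
    return low / enrollment, high / enrollment


for study, models in costs.items():
    for model, cost_range in models.items():
        low, high = per_pupil(cost_range)
        print(f"{study:15s} {model:27s} ${low:,.0f}-${high:,.0f} per pupil")
```

Expressed this way, the estimates for a given model differ widely between the two studies, which reflects the different cost components each one excludes.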
Another chapter in the volume underlines the idea
that successful adoption of whole-school reform re- 
quires a careful, inclusive model selection process and 
ongoing support and guidance from model develop- 
ers (Erlichson and Goertz 2002; Hertling 1999). 
Erlichson and Goertz report findings from a study 
examining 3 years of implementation of whole-school 
reform and school-based budgeting in New Jersey. 
The authors examined the response of schools and 
school districts to Abbott v. Burke, the May 1998 New 
Jersey Supreme Court decision that directed schools
in 30 poor urban districts to adopt comprehensive
school reform programs. “In practical terms, the
Court’s mandate required nearly 450 schools in New
Jersey to adopt such models in a three-year period as
the primary way of addressing the special education
needs of urban students” (p. 37). The Court also required
that each school adopt school-based budgeting
concurrently with the comprehensive school
reform model. 

One feature of this decision was that schools could 
choose among a variety of reform models. Erlichson 
and Goertz provide a breakdown of which models were 
selected and when they were implemented during the 
3-year period (table 2). 



Based on data collected through site visits, interviews, 
and questionnaires, Erlichson and Goertz draw a 
number of conclusions regarding the implementa- 
tion of these court-ordered reforms. Regarding the 
implementation of whole-school reform models, the 
authors identified six shortcomings that undermined 
the process: 

(1) Flawed model selection process — particularly 
inadequate information regarding the differ- 
ent options 

(2) Mismatch between expectations of a model 
and reality — a direct result of the flawed model 
selection process 

(3) Absence of a link to core curriculum content 
standards — standards that were established by 
the state and for which schools are held ac- 
countable 

(4) Lack of time — to train, plan, and establish a 
new vision to implement the model 

(5) Lack of consistent and meaningful support 
from developers — resulting, in part, from in- 
sufficient model staff to accommodate the in- 
flux of New Jersey schools 



(6) Lack of meaningful and consistent support
from New Jersey Department of Education
field staff (a team was assigned to every school
implementing whole-school reform)

Table 2. Statewide implementation of comprehensive whole-school reform models: By model
and cohort (year of implementation)

(Number of schools implementing each model is reported for each cohort)

                                                                  Midyear               Midyear
Model                                         One        Two       Second      Three      Third      Total
Year of implementation                  1998-1999  1999-2000  2000-2001  2000-2001  2001-2002

Accelerated Schools Program                     1         14         10          7          0         32
America's Choice                                0          6          3          8          4         21
Atlas                                           0          0          0          1          0          1
Coalition of Essential Schools                  3          3          8         32         13         59
Community for Learning                         23          8          3          2          0         36
Co-Nect                                         0          7          4         19          3         33
High Schools That Work                          0          0          0          2          6          8
Microsociety                                    0          0          1          0          0          1
Modern Red Schoolhouse                          2          5          0          2          0          9
Paideia                                         0          1          0          2          1          4
School Development Program (Comer)             16         13          7         76          6        118
Success for All/Roots and Wings                27         23          5         13          1         69
Talent Development                              0          1          0          1          9         11
Ventures                                        0          2          0         11          6         19
Alternative Program Design (approved)           0          0          0          6          7         13
Total whole-school reform schools              72         83         41        182         56        434

SOURCE: Erlichson and Goertz (2002).

In the end, the authors conclude with three lessons 
from the New Jersey case. First, the success of man- 
dated school reform depends on both the will and skill 
of educators across the levels of the system, but par- 
ticularly in the targeted schools. Second, the success- 
ful implementation of reform can be undermined by 
complex governance structures, unclear roles, and con- 
flicting messages. Finally, the multiple 
actors in the education system can 
limit the degree of school empower- 
ment that is realized, even in cases 
when the reform is designed to give 
schools greater discretion in how they 
approach reform. 

Teacher Supply and Quality in Urban Schools

High-quality teachers are a fundamen- 
tal resource needed to realize the high 
standards characteristic of most state 
accountability plans. While research- 
ers have debated the extent of the 
teacher supply problem nationally, there is general agree- 
ment that schools serving large numbers of poor and 
minority students face the greatest challenges in recruit- 
ing and retaining a faculty of qualified teachers. Fur- 
ther, urban schools tend to serve high concentrations of 
students with these characteristics, making teacher qual- 
ity issues a key concern in urban education. 

While recruiting high-quality teachers to urban, 
difficult-to-staff schools is a challenge, retaining those 
teachers over time is just as critical (Allgood and Rice 
2002). Recent research on the New York teaching 
workforce, for example, found that quit rates in the 
state are highest in New York City. In addition to these 
high quit rates, evidence suggests that teachers who 
leave New York City schools generally possess better 
qualifications than those who remain (Lankford, Loeb, 
and Wyckoff 2002). These attrition patterns are consistent
with earlier claims that one out of every
five New York City teachers leaves after the first year
and one of every three teachers leaves after 3 years
(Schwartz 1996). Resources aimed at elevating teacher 
quality in urban schools should be targeted not only 
toward drawing high-quality teachers to difficult-to- 
staff schools, but also toward reducing teacher turn- 
over in those schools. 

Two chapters in our volume on fiscal policy issues in 
urban education address the issue of teacher attri- 
tion in urban districts. Theobald and Michael exam- 
ine teacher attrition among novice teachers over 5 
years in four midwestern states (Illinois, Indiana,
Minnesota, Wisconsin). They examine
four categories of novice teachers, 
“those who: (a) taught continuously 
in the same district all 5 years 
(‘stayers’), (b) transferred to another 
school district(s) in the state, but 
remained in the state for all 5 years 
(‘movers’), (c) left public school 
teaching in a state and did not re- 
turn (‘leavers’), and (d) left public 
school teaching in a state, but re- 
turned (‘returnees’).” Figure 1 shows 
the percentage of teacher turnover 
in 5 years among the 11,787 teach- 
ers who entered the profession in the 
four states in 1995-96. As can be
seen from this figure, leaving teaching is related to 
certain personal characteristics such as race, age, and 
level of education. 

Figure 1 also provides similar information about the 
3,194 novice urban teachers who entered the profession
in the 1995-96 school year. These data reveal
that “urban teachers — regardless of their gender, race, 
age, or degree status — are significantly more likely to 
move out of their district than are novice teachers hired 
by non-urban districts” (Theobald and Michael 2002,
p. 144). The findings here underline the importance 
of including movers in this sort of analysis, since the 
results for leavers alone would lead to conclusions of 
little difference between novice teachers hired by ur- 
ban districts and those hired by nonurban districts. 

Theobald and Michael also present interesting find- 
ings regarding teacher attrition among novice teach- 
ers by level of education and subject area. Figure 2
reports these findings for all novice teachers hired during
the 1995-96 academic year, and for novice urban
teachers hired during that school year. Figure 2 shows 
that, among all novice teachers, mathematics teachers 
and science teachers (except for biology) are more likely 
to leave teaching and are less likely to transfer among 
school districts. This finding could be explained by 
the alternatives available to these individuals in the
broader labor market. The urban data presented in 
figure 2 reveal that urban teachers in special educa- 
tion, business, foreign language, and mathematics are 
more likely to move to other districts than their coun- 
terparts who were hired in nonurban districts. Again, 
a focus only on leavers would lead to a conclusion of 
little difference in teacher turnover between urban and 
nonurban schools. 



Figure 1. Percentage of teacher turnover and percentage of urban teacher turnover in 5 years
among teachers who entered the profession in four midwestern states in 1995-96: By
selected personal characteristics

[Bar chart not reproduced. Two panels, "Percent of teacher turnover" (among 11,787 teachers)
and "Percent of urban teacher turnover" (among 3,194 teachers), each showing movers and
leavers on a horizontal axis labeled "Percent" (0 to 80) for the following groups: all
teachers; male; female; minority; White; less than or equal to 30 years old on entry;
greater than or equal to 31 years old on entry; bachelor's degree; graduate degree.]

SOURCE: Theobald and Michael (2002).
Figure 2. Percentage of teacher turnover and percentage of urban teacher turnover in 5 years
among teachers who entered the profession in four midwestern states in 1995-96: By
selected professional characteristics

NOTE: Teacher turnover percentages are among 11,787 teachers.

SOURCE: Theobald and Michael (2002).

Theobald and Michael conclude their chapter with a
set of policy recommendations. These include provid- 
ing more funding to school districts serving dispro- 
portionately large concentrations of disadvantaged 
students, creating an external context in all school dis- 
tricts that is supportive of novice teachers, and pro- 
viding pay premiums for novice teachers in districts 
that face high turnover rates. 

In another chapter, Jennifer Imazeki examines teacher 
attrition in Wisconsin, paying particular attention to 
the role of wage differentials. Her data suggest that teach- 
ers do respond to wages; in other words, increasing 
teacher salaries can help to slow attrition. She also addresses
the question, How much do salaries need to increase?
To answer this question, Imazeki simulates several
wage-increase scenarios involving a $5,000 salary in- 
crease. Table 3 presents her results for Milwaukee. 
Imazeki points out that even without the $5,000 wage 
differential, Milwaukee teacher salaries are higher than 
those elsewhere in the state (ranging from one-half of a 
standard deviation for beginning teachers to one and a 
half standard deviations for maximum salaries). 

When $5,000 is offered in addition to current salaries, 
the effect depends on the scenario. Exits are most re- 
sponsive to an increase for all novice teachers, while trans- 
fers are responsive to changes in relative salary (i.e., 
targeted to Milwaukee). In general, for both men and 
women, $5,000 is not enough to bring overall attrition 
rates in Milwaukee to the levels of the average district 
in Wisconsin. While transfer rates reach the level of the 
average district, exit rates remain higher in Milwaukee. 
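
To make the survivor functions in table 3 easier to compare, the following minimal sketch converts the 5-year exit survivor values for Milwaukee into cumulative exit rates and reports the percentage-point change relative to actual salaries. The figures are copied from table 3. The scenario labels, variable names, and helper functions are illustrative assumptions, and reading 100 minus the survivor value as a cumulative exit rate is an interpretation rather than a calculation reported by Imazeki.

```python
# Five-year "exit" survivor values for Milwaukee, copied from table 3
# (read here as the percent of the entering cohort that has not yet exited).
SURVIVAL_YEAR5_EXITS = {
    "A: actual salary": {"women": 50.1, "men": 56.6},
    "B: $5,000, all beginning teachers": {"women": 54.9, "men": 61.2},
    "C: $5,000, beginning teachers in Milwaukee": {"women": 52.3, "men": 55.4},
    "D: $5,000, all experienced teachers": {"women": 51.1, "men": 60.6},
    "E: $5,000, all teachers in Milwaukee": {"women": 53.3, "men": 59.4},
}


def cumulative_exit_rate(survival_percent):
    """Assumed reading: percent who have exited = 100 - percent surviving."""
    return round(100.0 - survival_percent, 1)


def change_from_baseline(table, baseline="A: actual salary"):
    """Percentage-point change in survival relative to the baseline scenario."""
    base = table[baseline]
    return {
        scenario: {group: round(values[group] - base[group], 1) for group in values}
        for scenario, values in table.items()
        if scenario != baseline
    }


if __name__ == "__main__":
    for scenario, values in SURVIVAL_YEAR5_EXITS.items():
        rates = {group: cumulative_exit_rate(v) for group, v in values.items()}
        print(scenario, rates)
    print(change_from_baseline(SURVIVAL_YEAR5_EXITS))
```

Under this reading, scenario B (an increase for all beginning teachers) produces the largest gain in 5-year survival from exits, which matches the statement that exits are most responsive to an increase for all novice teachers.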



One explanation for the inability of financial incen- 
tives to solve the problem is that challenging work- 
ing conditions further complicate the low supply and 
high attrition rates of urban teachers. Urban teach- 
ers, for example, educate a disproportionate number 
of students with special needs and confront the high- 
est percentages of students who are not proficient in 
English. Urban teachers are much more likely than 
suburban and rural teachers to report problems such 
as high absenteeism, serious student violence, and 
poor parental involvement (Lippman, Burns, and 
McArthur 1996; Imazeki 2002; Van Horn 1999). 
As a result, compensation-based strategies alone are 
not likely to be sufficient to attract and retain high- 
quality teachers in urban schools. Policies must at- 
tend to the working conditions in these schools, in 
addition to providing targeted salary incentives, if 
they are to realize a high-quality urban teaching 
workforce (Allgood and Rice 2002). 

Addressing concerns about the teacher workplace and 
professional climate requires moving beyond state- 
level and even district-level reform strategies. Because 
most recruitment and induction activities occur 
within local schools and classrooms, it is important 
that policymakers at more centralized levels of the 
system be attentive to varying local capacities. Solu- 
tions to the teacher supply and quality problem in 
our inner cities may require a more concerted effort 
to support creative, locally designed strategies to en- 
hance the professional environment of emerging edu- 
cators (Roellke and Meyer 2003; Theobald and 
Michael 2002). 



Table 3. Survivor functions for Milwaukee: Simulations with $5,000 added to salaries

Scenario A: Actual salary
Scenario B: $5,000 for all beginning teachers
Scenario C: $5,000 for beginning teachers in Milwaukee
Scenario D: $5,000 for all experienced teachers
Scenario E: $5,000 for all teachers in Milwaukee

Duration of             A             B             C             D             E
teaching spell
(years)             Women    Men  Women    Men  Women    Men  Women    Men  Women    Men

Exits
  1                  81.1   83.6   83.7   85.9   82.4   83.2   81.6   85.5   82.9   85.1
  3                  59.4   66.8   63.9   70.7   61.6   65.8   60.4   70.1   62.5   69.3
  5                  50.1   56.6   54.9   61.2   52.3   55.4   51.1   60.6   53.3   59.4

Transfers
  1                  97.2   94.9   97.3   95.9   97.9   96.2   97.4   95.1   98.1   96.3
  3                  93.1   88.1   93.3   90.2   94.9   90.9   93.7   88.6   95.3   91.3
  5                  89.5   84.3   89.8   86.9   92.1   87.9   90.4   84.8   92.8   88.3

SOURCE: Imazeki (2002).
Nontraditional Support for Urban
Schools 

Most analyses of fiscal education policy tend to focus 
on revenues generated by local, state, and federal gov- 
ernments. However, school leaders and policymakers 
are taking advantage of a broader resource base than is 
typically recognized in more traditional fiscal analyses. 
Fiscal and personnel support for students in urban 
schools is also derived from nontraditional sources, in- 
cluding private foundations, volunteer networks, and 
other human service agencies. Two chapters in our vol- 
ume address these nontraditional resources, which can 
provide for a variety of urban school services, including 
tutoring, vocational counseling, literacy programs, and 
even teacher training (Schwartz, Amor, and Fruchter 
2002). Schwartz and her colleagues use a combination 
of school budget data and survey data to find that the 
vast majority of public schools receive some sort of non- 
traditional or private support, or both. Table 4 illus- 
trates that for New York City the amount of such 
support varies significantly across schools, reflecting the
interplay of donor preferences, the ability of school leaders
to solicit funds, and the political economy of non-entitlement
government funding.
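
As a rough illustration of how much this support varies, the sketch below uses the means, minimums, and maximums reported in table 4 to compute the spread between the best- and least-supported schools and the size of mean nontraditional revenue relative to mean total spending. The dictionary layout and function names are assumptions made for illustration. Note also that a ratio of two means is not the same as the average school's share, which table 4 does not report.

```python
# (mean, minimum, maximum) per-pupil amounts in dollars, copied from table 4.
TABLE4 = {
    "private support": (59.0, 1.0, 1901.7),
    "state grants": (104.3, 8.4, 2402.7),
    "federal grants": (52.5, 1.9, 820.3),
    "other grants": (132.9, 68.5, 386.3),
    "nontraditional total": (348.4, 104.9, 2873.6),
}
MEAN_TOTAL_SPENDING = 8245.7  # mean total spending per pupil, table 4


def share_of_spending(mean_amount, total=MEAN_TOTAL_SPENDING):
    """Mean amount as a percent of mean total spending per pupil."""
    return round(100.0 * mean_amount / total, 1)


def max_to_min_ratio(minimum, maximum):
    """How many times larger the maximum is than the minimum."""
    return round(maximum / minimum, 1)


if __name__ == "__main__":
    for name, (mean, low, high) in TABLE4.items():
        print(name, share_of_spending(mean), max_to_min_ratio(low, high))
```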

In some cases, these nontraditional resources can ac- 
count for over half of the financing of children’s services 
within urban school districts (Picus et al. 2002). Picus
and his colleagues take into consideration not only school 
budgets, but also other public spending associated with 
children’s services. Table 5 provides both low and high 
estimates of this spending for health and social services 
in the University of Southern California area. 
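
The "over half" figure can be checked directly against table 5. The sketch below recomputes each provider's share of the total and the combined share of the providers other than the school district, under both the low and the high estimate. The dollar amounts are copied from table 5; the function names and the choice to group every non-district provider together are assumptions made for illustration.

```python
# Estimated expenditures (in dollars) by service provider, copied from table 5.
LOW = {
    "County of Los Angeles": 223_833_900,
    "City of Los Angeles": 15_403_054,
    "Not-for-profit agencies": 48_766_800,
    "Los Angeles Unified School District": 117_818_860,
}
# Only the school district figure differs in the high estimate.
HIGH = {**LOW, "Los Angeles Unified School District": 260_994_110}


def percent_shares(spending):
    """Each provider's percent of the total, rounded to two decimals."""
    total = sum(spending.values())
    return {name: round(100.0 * amount / total, 2) for name, amount in spending.items()}


def non_school_share(spending, school_key="Los Angeles Unified School District"):
    """Percent of the total spent by providers other than the school district."""
    total = sum(spending.values())
    return round(100.0 * (total - spending[school_key]) / total, 2)


if __name__ == "__main__":
    for label, data in (("low", LOW), ("high", HIGH)):
        print(label, sum(data.values()), percent_shares(data), non_school_share(data))
```

With these figures, providers other than the school district account for roughly 71 percent of the total under the low estimate and roughly 52 percent under the high estimate.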

These findings imply that private and nontraditional 
resources devoted to education and other children’s 
services are not trivial and therefore should be included
more often in fiscal analyses of education. Further, the
research indicates a substantial level of variability in 
the amount and the distribution of nontraditional re- 
sources, which has clear implications for both equity 
and efficiency of urban schools and school systems. 
Finally, the capacity of schools to adopt certain reforms 



Table 4. Nontraditional revenues in New York City public schools (N = 1,023 schools)

Variable                                        Mean   Minimum   Maximum

Private support per pupil (in dollars)          59.0       1.0   1,901.7
State grants per pupil (in dollars)            104.3       8.4   2,402.7
Federal grants per pupil (in dollars)           52.5       1.9     820.3
Other grants per pupil (in dollars)            132.9      68.5     386.3
Total per pupil (in dollars)                   348.4     104.9   2,873.6
Total spending per pupil (in dollars)        8,245.7   1,673.1  22,414.4
Percent White                                   15.7       0.0      93.8
Percent Black                                   36.7       0.2      97.6
Percent Hispanic                                37.5       1.3      98.1
Percent Asian                                   10.2       0.0      94.3
Percent female                                  49.1       6.0      87.9
Percent immigrant                                8.2       0.0      96.3
Percent limited English proficient (LEP)        15.9       0.1     100.0
Percent free lunch                              71.7       5.9     100.0
Percent special education                        6.1       0.0      37.7
Enrollment                                   1,001.5      51.0   5,004.0

NOTE: Private support per pupil includes contributions of cash, equipment, and services. State grants per pupil include legislative
grants, magnet grants, and Comprehensive Instructional Management System grants. Federal grants per pupil include magnet
grants and federal bilingual program (Title 7) grants. Other (non-entitlement) grants per pupil include capital projects, building
Board of Education/Office for Development maintenance, student information services, early grade paraprofessionals
redeployment, self-sustaining accounts, Employment Prep Education Program, city-funded programs, and food services. The total
is the sum of the above. Total spending per pupil includes direct services to schools, district and superintendency costs, and
systemwide costs (no pass-throughs). Direct services to schools include classroom instruction, instructional support services,
school leadership, ancillary support services, building services, and district support; district/superintendency costs include
instructional support/administration and other district/borough costs; systemwide costs include central instructional support,
central administration, and other obligations.

SOURCE: Schwartz, Bel Hadj Amor, and Fruchter (2002).
may depend on the availability of nontraditional sources
of support. For instance, since several whole-school 
reform designs rely heavily on the use of volunteers 
and community resources, policymakers must be at- 
tentive to the availability (or lack thereof) of this im- 
portant pool of nontraditional and private support. 

As market-oriented reforms gain momentum, it is clear 
that additional attention to these nontraditional revenues 
and data on private school finance are needed. The choice 
provisions contained within the federal No Child Left 
Behind Act illustrate this ongoing interest in experiment- 
ing with market-based mechanisms as a means for re- 
forming public education. This is also evident in the 
privatization of public schools in several urban districts 
like Philadelphia and Baltimore, the expansion of char- 
ter schools throughout the country, and the increasing 
attention being given to vouchers as a means to allow 
students enrolled in failing public schools to elect to 
attend private schools. We know very little, for example, 
about the manner in which private schools secure, allo- 
cate, and use educational resources (Brent 2002). 

Accountability and Adequacy in 
Urban Schools 

The current policy climate characterized largely by 
high-stakes accountability introduces a variety of 
monitoring and assessment issues. These issues are
especially salient for urban schools where student per- 
formance is often a concern. While much of the at- 
tention associated with monitoring school systems 
has focused on testing, broader accountability goals 
require more sophisticated analyses that assess fiscal 
condition and capacity, opportunity to learn, and 
other equity-related issues. It is clear that urban dis- 
tricts struggle to meet the demands of fiscal account- 
ability and increased standards for student 
achievement (Alexander 2002). 

This gap between higher achievement standards and 
the resources required to reach them has resulted in a 
series of legal challenges focused on adequacy. Often 
referred to as the “third wave” of school finance litiga- 
tion, plaintiffs argue that finance formulas prevent 
poor school districts from providing an adequate edu- 
cation as defined by state education clauses.^ An im- 
portant and recent third wave victory for plaintiffs is 
the New York Court of Appeals ruling in Campaign 
for Fiscal Equity v. State of New York (2003). The 4-1 
decision overturned a 2002 state appellate court rul- 
ing that the state was responsible for providing only 
an eighth- or ninth-grade education. The higher court 
ruled that a “sound basic education” goes beyond 
eighth- or ninth-grade, and should include a “mean- 
ingful high school education.” The remedy laid out 
by the Court of Appeals requires a costing-out study 
(to be completed by March 2004) to determine a 



For a more detailed discussion of school finance litigation, see Roellke, Green, and Zielewski (in press). 



Table 5. Total estimated health and social service expenditures in the University of Southern
California area

Service provider                        Amount (in dollars)  Percent of total

Low estimate
  County of Los Angeles                         223,833,900             55.16
  City of Los Angeles                            15,403,054              3.80
  Not-for-profit agencies                        48,766,800             12.02
  Los Angeles Unified School District           117,818,860             29.03
  Total                                         405,822,614

High estimate
  County of Los Angeles                         223,833,900             40.77
  City of Los Angeles                            15,403,054              2.81
  Not-for-profit agencies                        48,766,800              8.88
  Los Angeles Unified School District           260,994,110             47.54
  Total                                         548,997,864

SOURCE: Picus, McCroskey, Robillard, Yoo, and Marsenich (2002).
dollar amount that can ensure that all students have
the opportunity to obtain this higher level of achieve- 
ment specified by the Court. 

Concluding Remarks 

The pressure for urban schools to operate efficiently, 
equitably, and adequately is unprecedented. As we in- 
dicated in our introduction, this challenge has become 
even more daunting in light of dramatic budget cuts 
for urban schools. Research does not easily yield solutions
to the problems of urban schools. We are
confident, nonetheless, that emerging studies on whole- 
school reform, teacher supply and quality, nontradi- 
tional revenues, high-stakes accountability, and other 
pressing fiscal policy issues can assist policymakers and 



school leaders in their quest for increased equity and 
enhanced productivity in urban schools. The second 
volume of our series, guest edited by Faith Crampton 
and David Thompson, focuses on school infrastruc- 
ture funding in both the United States and Canada.^ 
A diverse set of school finance and policy analyses is
included in this second volume, covering capital needs
in urban and rural schools; specific infrastructure con- 
siderations for students with disabilities; school finance 
litigation as a strategy for improving school facilities; 
and the role of school administrators in school renova- 
tion projects. Our goal is that these volumes, along 
with others that follow in the series, can assist aca- 
demic researchers, policymakers, and school practitio- 
ners in their efforts to improve education fiscal policy 
and practice. 



^ See Crampton and Thompson (2003). 
Introduction

Improving the quality of primary and secondary edu- 
cation is a top priority of the Bush administration and 
of members of both parties in Congress. In introducing
his education proposals to Congress, the president
highlighted the “academic achievement gap” that ex- 
ists between students from rich and poor families and 
between White and minority children (U.S. Depart- 
ment of Education 2001). As evidence of the failure of 
the current system, the president cited the fact that 
“nearly 70 percent of inner city fourth graders are un- 
able to read at a basic level on national reading tests.” 
The primary goal of the administration’s education 
policy is to close this achievement gap and to ensure 
that, in President Bush’s oft-repeated phrase, “no child 
should be left behind.” 

To that end, the Bush administration proposed, and 
after extensive debate, Congress enacted, the No Child
Left Behind Act of 2001. The new legislation mandates
annual testing of all students in grades 3 through
8 and requires that schools make annual progress in 
meeting student performance goals for all students 
and for separate groups of students characterized by 
race, ethnicity, poverty, disability, and limited En- 
glish proficiency (U.S. Department of Education 
2002). The underlying premise of the legislation is 
that schools must be held accountable for the aca- 
demic performance of their students. The legislation 
will reward schools that succeed in meeting state- 
imposed achievement goals and will sanction schools 
that fail. The intent is that all students, but especially
students from disadvantaged backgrounds, show
annual improvements in their academic performance 
as measured against state standards. Measuring stu- 
dent performance is thus a necessary component in a 
policy designed to improve the quality of education. 
We doubt it is a sufficient policy. In this article, we 
present evidence suggesting that measuring student 
performance, setting performance standards, and 
threatening to sanction schools that fail to meet these 
standards are unlikely to close achievement gaps un- 
less accompanied by a restructuring of the financing 
of public education. It should be noted that although 
the new federal legislation represents the first time 
that all states will be required to test students on an 
annual basis, some states have been ad- 
ministering such tests for a number of 
years. Despite this testing and the 
publication of the results on an indi- 
vidual school basis, the performance 
of students in many schools, especially 
those serving disadvantaged children, 
remains substantially below average. 



We suggest that the amount of money 
necessary to meet student perfor- 
mance standards will vary across 
school districts. This variation in costs 
will be due primarily to factors over 
which local school officials have little 
control. For example, a school district 
with a high concentration of students from poor fami- 
lies or from families where English is not spoken in 
the home may have to use additional resources (in 
the form of smaller classes or specialized programs) 
to reach specified achievement goals. Also, some dis- 
tricts, given their location and the composition of 
their student bodies, will have to pay higher salaries 
than other districts to attract high-quality teachers. 

Requiring that all schools increase the academic per- 
formance of their students is a potentially important 
step in improving the quality of education in the 
United States. However, if cost differences among school 
districts are substantial, then imposing statewide stu- 
dent performance standards without simultaneously 
allocating more state financial aid to school districts 
with high costs may result in a situation in which 
school districts with above-average costs will not have 



enough resources to educate their students to meet 
the new standards. These schools may fail, not neces- 
sarily because of their own inability to effectively edu- 
cate children but because they have insufficient fiscal 
resources to do the job. 

Although the new federal education legislation does 
not explicitly address the connection between the cost 
of education and student performance, over the past 
decade the courts in a number of states have explicitly 
recognized the link between educational finance and 
student performance. In several states, the courts have 
declared state school financing systems unconstitu- 
tional because they have failed to provide all students, 
and especially those from economically disadvantaged 
families, with a sufficiently high- 
quality education; in the language 
favored by the courts, these systems 
have failed to provide an adequate 
education (Minorini and Sugarman 
1999). Prior to these recent court 
cases, the focus of most school fi- 
nance reform had been on resources 
alone. All states, to one degree or 
another, use state grants to school 
districts to partially equalize the fis- 
cal resources districts have available 
at a given rate of property taxation. 
In most states, grant formulas dis- 
tribute aid inversely to the size of 
each district’s per student property 
tax base but fail to account for differences in costs 
among school districts, differences that may contrib- 
ute to varying student performance. 

It should be emphasized that providing schools with 
enough resources to achieve state-imposed student 
performance goals will not guarantee that schools will 
actually use those resources effectively to improve stu- 
dent performance. However, once state governments 
have guaranteed that all school districts have sufficient 
financial resources to achieve state education goals, then 
the states can be aggressive in taking steps to inter- 
vene in those districts that fail to improve student per- 
formance. 

To determine the minimum amount of money a school 
district must spend to achieve a specified improvement 
in student performance, we estimate an educational
cost function using data from elementary and secondary
school districts in the state of Texas. Texas is a particularly
interesting state to study for two reasons. First,
as is now well known, Texas has been administering 
annual student performance tests to its public school 
students since 1990 and using these tests as the basis 
for an accountability system that includes monetary 
rewards for schools and graduation requirements for 
individual students (Murnane and Levy 2001). Sec- 
ond, spurred by a series of court challenges to its school 
financing system, Texas’s system of state financial aid 
to local school districts takes explicit account of many 
of the factors that studies in other states have found to 
be systematically related to the costs of education. 

The rest of this article is divided into 
six sections. We start with a brief over- 
view of the school finance system in 
Texas. Then, in the following section, 
we derive our cost function and dis- 
cuss a set of estimation issues. The fol- 
lowing section presents the 
econometric results of our cost func- 
tion estimation. In the next two sec- 
tions, we address the question of how 
state aid formulas could be adjusted to 
account for differences in costs across 
school districts. We first discuss the cal- 
culation of a cost index that allows us 
to summarize the results of the esti- 
mation and then demonstrate how 
such an index can be used in a formula designed to guar- 
antee that every district would have sufficient fiscal re- 
sources to achieve any state-imposed student performance 
goals. In the final section, we draw some conclusions. 

School Finance in Texas 

During the 2001-02 academic year, public schools in 
Texas educated 4.1 million students. Of the $28 bil- 
lion of revenues raised for public education in 2001- 
02, 55 percent came from local sources, 42 percent 
from the state government, and 3 percent from the 
federal government (Texas Education Agency 2002). 
The state of Texas is divided into 1,045 school dis- 
tricts, with 968 providing K-12 education. The state
uses a complex mechanism for distributing state aid 
to these districts. The major elements of the state aid 
system involve a $250 per student grant to each dis- 
trict; a foundation formula, with the $2,537 per pu- 
pil foundation level of spending adjusted for 
diseconomies of scale and for differences across dis- 
tricts in the cost of resources; and a guaranteed tax 
base formula for districts with property tax rates in 
excess of $8.60 for each $1,000 of property valuation 
and per pupil property tax bases below $258,100. 
Using a system of pupil weights, additional state aid 
is provided to school districts with concentrations of 
children from economically disadvantaged families and 
children eligible for various special education programs 
designated for those with disabilities and with limited 
proficiency in English. Finally, a 
unique element of the Texas system 
of school finance is that property- 
wealthy districts (those with more 
than $300,000 per weighted pupil) 
are required to “reduce their wealth 
to this level.” In most cases, this is 
accomplished by agreeing to educate 
students residing in other districts or 
by purchasing “attendance credits.”* 
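
The guaranteed tax base element described above can be illustrated with a textbook calculation. The sketch below is not the Texas statute: it assumes that the state pays, for each dollar of tax rate above the $8.60 threshold, the difference between what the district would raise on the $258,100 guaranteed base and what it raises on its own base. The threshold and the guaranteed base come from the text; the functional form, the function name, and the example district are hypothetical.

```python
# Figures quoted in the text for the guaranteed tax base provision.
THRESHOLD_RATE_PER_1000 = 8.60       # dollars of tax per $1,000 of valuation
GUARANTEED_BASE_PER_PUPIL = 258_100  # dollars of property value per pupil


def gtb_aid_per_pupil(tax_rate_per_1000, base_per_pupil):
    """Textbook guaranteed-tax-base aid per pupil (an assumed form).

    The state makes up the revenue a district would raise on the gap
    between the guaranteed base and its own base. Applying the guarantee
    only to the tax rate above the threshold is an assumption made here.
    """
    excess_rate = max(0.0, tax_rate_per_1000 - THRESHOLD_RATE_PER_1000)
    base_gap = max(0.0, GUARANTEED_BASE_PER_PUPIL - base_per_pupil)
    return excess_rate / 1000.0 * base_gap


if __name__ == "__main__":
    # Hypothetical district: $10.60 per $1,000 and a $158,100 per pupil base.
    print(round(gtb_aid_per_pupil(10.60, 158_100), 2))
```

In this form, aid falls to zero for districts taxing at or below the threshold rate and for districts whose per pupil base meets or exceeds the guaranteed level, which is the sense in which such formulas distribute aid inversely to property wealth.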

Although there has been considerable 
debate among scholars concerning the 
magnitude of improvement in student 
performance, the Texas Education Agency
has argued that student performance
on state-administered exams has improved
dramatically (Murnane and Levy 2001). Nevertheless,
data from testing done during the 2000-01 school year
demonstrate that student performance in school dis- 
tricts with a high percentage of poor children and dis- 
tricts with a high percentage of minorities was 
substantially below average. For example, students in 
the 87 school districts where more than 75 percent of 
students came from poor households had composite test 
scores that were nearly one and a quarter standard de- 
viations below average. Average student performance in 
some Texas cities was even weaker. For example, the av- 
erage pass rate in San Antonio was 1.85 standard devia- 
tions below average and the average rate in Dallas was 
two and a quarter standard deviations below average. 






* See Texas Education Agency (2002) for a full description of these provisions.
Cost Function Estimation

Using data on per pupil school expenditures, student 
performance, and various characteristics of school dis- 
tricts, we estimate a cost function for K-12 public 
education in Texas. Estimating a cost function allows 
us to quantify the relationship between per pupil 
spending; student performance; various student char- 
acteristics; and the economic, educational, and social 
characteristics of school districts. We follow Bradford, 
Malt, and Oates (1969) in specifying the output of 
public schools (measured, for example, by student 
performance on standardized exams) as a function of 
school resources, such as teachers and textbooks, the 
characteristics of the student body, and the family and 
neighborhood environment in which 
the students live. This relationship is 
represented by equation (1), where $S_{it}$
represents an index of school output,
$X_{it}$ is a vector of direct school inputs,
$Z_{it}$ is a vector of student characteristics,
and $F_{it}$ is a vector of family and
neighborhood characteristics. The subscript
$i$ refers to the school district and
subscript $t$ refers to the year.

(1)  $S_{it} = g(X_{it}, Z_{it}, F_{it})$

To move from this education production
function to a cost function, a relationship
between school inputs and
educational spending is specified. This
is shown in equation (2), where per pupil expenditures,
$E_{it}$, are considered as a function of school inputs,
$X_{it}$, a vector of input prices, $P_{it}$, and $\varepsilon_{it}$, a vector
of unobserved characteristics of the school district.

(2)  $E_{it} = f(X_{it}, P_{it}, \varepsilon_{it})$

The final step involves solving equation (1) for $X_{it}$ and
then plugging it into equation (2). This gives the cost
function represented by equation (3), where $u_{it}$ is a
random error term.

(3)  $E_{it} = h(S_{it}, P_{it}, Z_{it}, F_{it}, \varepsilon_{it}, u_{it})$.

Typically, equation (3) is assumed to be log linear 
and estimated with district-level data for a given state. 
In the next section, we present estimates of equation 
(3) using 1995-1996 data for K-12 school districts
in Texas. The dependent variable is the log of per 
pupil expenditures (excluding spending on transpor- 
tation). The resulting coefficients indicate the con- 
tribution of various district characteristics to the cost 
of education, holding constant the level of output. 
There is some discussion in the literature on educa- 
tion production functions about the desirability of 
using school-level data (Hanushek, Rivkin, and Tay- 
lor 1996). We use district-level data here for two 
reasons: First, state aid in almost all states is distrib- 
uted to the district, and there is very little system- 
atic information on how money is spent at the school 
level. Second, several of the school and community 
variables that we include in our analysis are avail- 
able only at the district level. In the remainder of 
this section, we discuss a number 
of methodological and data issues 
that must be addressed to carry out 
the estimation. 
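
For concreteness, one log-linear form of equation (3) can be written as follows. This is only a sketch under stated assumptions: the coefficient symbols are introduced here for illustration, and which variables enter in logs rather than levels is a choice made for each specification (table 2 lists the variables actually used).

```latex
\ln E_{it} = \beta_0 + \beta_S \ln S_{it} + \beta_P P_{it}
           + \beta_Z' Z_{it} + \beta_F' F_{it} + u_{it}
```

In this form, $\beta_S$ is read as the percentage change in per pupil spending associated with a 1 percent change in school output, holding the cost factors constant.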

As pointed out by Duncombe and 
Yinger (1999), estimating cost func- 
tions provides a practical way to 
identify and quantify the factors that 
influence the costs of education, in 
which the output of school districts 
can be measured using multiple 
measures of school performance. Al- 
though student performance can, in 
principle, be measured in various 
ways, many states measure the effectiveness
of schools by relying on standardized exams.
For several years, Texas has been testing all students
in grades 3 through 8 and in grade 10 in reading and
math. The tests are administered in the spring of each 
year as part of the Texas Assessment of Academic Skills 
(TAAS). Considerable media attention is paid to the 
test score results, and improvements in average test 
scores (or lack thereof) are monitored closely. 

One of the ways in which this study differs from other 
cost function studies is in the use of a value-added 
measure of student performance in each school dis- 
trict. As a measure of school district output, we com- 
pare the average of the composite passing rate on the 
TAAS exams across grades 4 through 8 and in grade 
10 in 1995-1996 with the average passing rates in 
grades 3 through 7 of the same cohort of students in 
1994-1995 and the 8th-grade TAAS passing rate in 



1993-1994 (to match the 10th-grade passing rate
in 1995-1996).^2 Robert Meyer (1996) provided a
strong argument for using a value-added approach 
to isolate the contribution of school resources to in- 
creases in student achievement. He pointed out that 
the use of average scores from a single grade measures 
the average level of achievement prior to entering first 
grade, plus the average effects of school performance 
and of family, neighborhood, and student character- 
istics on the growth of student achievement from all 
years of previous schooling. It is thus likely that rather 
than providing a measure of the contribution of 
schools to the growth in student achievement, the 
single grade score primarily reflects the impact of fam- 
ily and neighborhood environment on student 
achievement. In addition, many of 
the recent policy proposals regard- 
ing standards, including those of the 
Bush administration, have focused 
on improvement in test scores from 
year to year. The value-added ap- 
proach is thus more useful for simu- 
lating the effects of actual policies. 



In addition to the TAAS scores, we 
also include student performance on 
the ACT exams as a measure of the 
quality of the preparation of students 
for higher education. Using scores on 
these exams as a measure of school 
quality can be problematic, however, 
because students decide whether to take the exam. 
Only students with a particular interest in continu- 
ing on to college will choose to take these exams, and 
these are presumably the “best” students, so their 
scores may reflect their own abilities and motivation 
rather than any influence of the school. By treating 
these scores as endogenous, we are able to control for 
this self-selection. As an instrument for ACT scores, 
we include the percentage of students who take a col- 
lege entrance exam. 






Estimation of the cost function must take account of 
the fact that the educational output variables and 
per pupil expenditures are determined simulta- 
neously. That is, although local school board deci- 
sions to raise the level of student performance are 
expected to have direct implications for the level of 
spending, decisions concerning per student spend- 
ing are likely to influence student performance. To 
deal with this simultaneity, we estimate equation (3) 
using two-stage least squares, with the school output 
variables treated as endogenous. As instruments for 
these school output variables, we draw on a set of 
variables that are related to the demand for public 
education. Following a long literature on the deter- 
minants of local government spending, we model the 
demand for public education as a 
function of school district residents’ 
preferences for education, their in- 
comes, and the tax prices they face 
for education spending. To the ex- 
tent that the median voter model 
provides a reasonable framework for 
modeling school district spending 
decisions, it is appropriate to use 
median income and the tax price 
faced by the median voter as instruments.^3
We also include as instruments
two socioeconomic variables
that may be related to the prefer- 
ences for public education: the per- 
centage of households with children 
and the percentage of household heads who are 
homeowners. 
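
The two-stage procedure described above can be sketched in a few lines. The example below is a minimal illustration on synthetic data, not the estimation reported in this paper: the variable names, the data-generating values, and the helper `two_stage_least_squares` are all assumptions introduced here, and the enrollment weighting used in the actual regressions is omitted.

```python
import numpy as np


def two_stage_least_squares(y, endog, exog, instruments):
    """Return 2SLS coefficients ordered as [constant, endog..., exog...]."""
    const = np.ones((len(y), 1))
    X = np.column_stack([const, endog, exog])        # cost-equation regressors
    Z = np.column_stack([const, instruments, exog])  # every exogenous variable
    # First stage: fitted values of each regressor from the instrument set.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress the dependent variable on the fitted regressors.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]


# Synthetic data, invented purely for illustration (not the Texas sample).
rng = np.random.default_rng(0)
n = 5000
demand = rng.normal(size=(n, 2))   # stand-ins for demand-side instruments
cost_factor = rng.normal(size=n)   # stand-in for an exogenous cost factor
shock = rng.normal(size=n)         # unobserved factor moving output and spending
log_output = (0.8 * demand[:, 0] + 0.5 * demand[:, 1] + 0.3 * cost_factor
              + shock + 0.3 * rng.normal(size=n))
log_spending = (1.0 + 2.0 * log_output + 0.5 * cost_factor
                - 1.5 * shock + 0.5 * rng.normal(size=n))

beta_2sls = two_stage_least_squares(log_spending, log_output, cost_factor, demand)
ols_X = np.column_stack([np.ones(n), log_output, cost_factor])
beta_ols = np.linalg.lstsq(ols_X, log_spending, rcond=None)[0]
print("2SLS output coefficient:", round(float(beta_2sls[1]), 3))
print("OLS output coefficient: ", round(float(beta_ols[1]), 3))
```

Because the synthetic shock moves output and spending in opposite directions, ordinary least squares understates the output coefficient (set to 2.0 here), whereas the instrumented estimate should land close to it.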

For school input prices, we focus only on teacher sala- 
ries. Teachers are the single most important factor in 
the production of education, and not surprisingly, 
teacher salaries account for the largest share of school 
expenditures. It is important, however, to recognize 
that teacher payrolls are determined both by factors 
under the control of local school boards and factors 



^2 Test scores represent the same students in the two academic years to the extent that interdistrict student mobility is relatively low. A
recent study of elementary school students in Texas by Hanushek, Kain, and Rivkin (2001) found that roughly 86 percent of fourth- to
seventh-graders remain in the same district from one year to the next.

^3 We use the tax price implied by Texas’s aid formula.

^4 As mentioned previously, we also include the proportion of students who take a college entrance exam as an instrument for ACT scores.

^5 In the results presented in the next section, the 1995-1996 Texas Assessment of Academic Skills (TAAS) scores are treated as endogenous,
but the lagged scores are not. Hausman specification tests could not reject the null hypothesis that the lagged scores are exogenous.
that are largely outside of their control. In setting hiring
policies, districts make decisions about the quality
of teachers they wish to recruit. These decisions have 
obvious fiscal implications. For example, a district can 
limit its search for new teachers to those with advanced 
degrees, those with high grade point averages, or those 
with a certain number of courses in their teaching spe- 
cialty. Teacher salary levels are generally determined 
through a process of negotiation with teacher unions, 
and school boards have a substantial impact on the 
outcome of these negotiations. At the same time, the 
composition of the student body, working conditions 
within schools, and area cost of living play a poten- 
tially large role in determining the salary a school dis- 
trict must offer to attract teachers of any given quality. 
These factors will be reflected in stu- 
dent and district cost variables, to be 
described below. We would therefore 
like a measure of teacher salaries that 
reflects only salary differences that are 
outside the control of local school dis- 
tricts. One such measure is the teacher 
cost index developed by Jay Chambers 
(1995). Using 1990-1991 data from 
the National Center for Education Sta- 
tistics’ nationally representative Schools 
and Staffing Survey, Chambers esti- 
mated hedonic wage equations for 
teachers. He isolated those factors that 
are outside the control of the local 
school district (such as the racial com- 
position of the student body, local climate, crime rates, 
etc.) and used the coefficients for just those factors to 
construct a teacher salary index for each district in the 
country. By using this index as our measure of teacher 
salaries, differences across districts reflect true differ- 
ences in costs rather than differences in school board 
choices. 

The vectors of student, family, and neighborhood characteristics,
$Z_{it}$ and $F_{it}$, include several variables that
influence a district’s level of spending per pupil. First,
there is considerable evidence that there are higher 
costs associated with the education of children from 
low-income families. To measure the number of chil- 
dren from economically disadvantaged families, we use 
the percentage of students who qualify for the federal 



government-financed free and reduced-price lunch pro- 
gram or other public assistance. Second, there is a sub- 
stantial literature that documents the extra costs 
associated with educating students with various kinds 
of disabilities and students who enter the schools with 
limited knowledge of English. Therefore, we include 
the percentage of students who have been identified 
as limited English proficient and the percentages of 
students with two categories of disabilities: the per- 
centage of students who are classified as having any 
type of disability and the percentage of students who 
are classified as autistic, deaf, or blind. Third, to re- 
flect the possibility that more resources may be needed 
to provide a high school education as compared to an 
elementary school education, we also include the pro- 
portion of each school district’s stu- 
dent body that is enrolled in high 
school. Finally, to reflect potential 
diseconomies of scale associated 
with both small and large school 
districts, we include each district’s 
enrollment and enrollment squared. 
Summary statistics of all variables 
are presented in table 1. 

The variable $\varepsilon_{it}$ in equation (3) represents
the unobserved factors in
each school district that influence 
district spending. One such factor 
is the “inefficiency” of the district. 
That is, even after accounting for dif- 
ferences across school districts in cost factors, input 
prices, and student performance, some school districts 
will have higher levels of per pupil expenditures than 
other districts because those school districts are ineffi- 
cient. This could mean that they are inefficiently or- 
ganized or managed or that they use ineffective 
teaching techniques or employ a particularly ineffec- 
tive group of teachers. 

A number of recent articles have used complex statis- 
tical techniques to identify spending that is high rela- 
tive to spending in districts with similar student 
performance and costs.^6 Although great care must be
taken in interpreting this extra spending as a measure 
of inefficiency, we include in our cost function estimates 
an efficiency index calculated using data envelopment 






^6 See, for example, Bessent and Bessent (1980); Deller and Rudnicki (1993); Duncombe, Ruggiero, and Yinger (1996); McCarty and
Yaisawarng (1993); and Ruggiero (1996).










Table 1. Descriptive statistics for 803 Texas K-12 school districts: 1995-96

                                                                    Standard   Minimum   Maximum
Variable                                                      Mean deviation     value     value
Per pupil expenditures, 1995-96, excluding
  transportation (in dollars)                                5,565     1,039     2,907    11,444
Composite TAAS pass rate, 1995-96                             79.6       8.3      51.3      96.2
Composite lagged TAAS pass rate, 1993-95                      75.5       9.5      37.1      95.4
Average ACT score                                             19.9       1.6        15        26
Teacher salary index                                          84.5       9.2      62.2     107.5
Percent of students eligible for free and reduced-price
  lunch and other public assistance                           44.4      18.3       0.1     100.0
Percent of students with disabilities                         13.4       4.0       0.7      36.6
Percent of students with severe disabilities                  0.21      0.20         0      1.82
Percent of students with limited English proficiency           6.5      10.5         0      72.5
Percent of students enrolled in high school                   28.6       3.2      16.2      49.9
Student enrollment                                         4,081.5   8,614.1       124    74,772
Efficiency index                                              0.59      0.12       0.3         1
Tax price                                                      0.5       0.3         0         1
Percent of households with children                           34.8       7.7      16.2        70
Percent homeowners                                            73.7      10.3       0.0      99.2
Median income (in dollars)                                  23,814     7,258     8,196    58,135
Percent of students taking college entrance exams             65.1      14.6      21.4       100

NOTE: TAAS is Texas Assessment of Academic Skills. The teacher salary index is normalized around 1 for all school districts in the
United States. As indicated in the table, the average district in Texas has salary costs that are 84.5 percent of the national average.
Texas school districts with relatively high teacher costs as measured by the index have costs that are above the national average.
The efficiency index takes a value of 1 for the most efficient districts. The data in the table show that the average district in Texas
is 59 percent as efficient as the most efficient districts in the state.

SOURCE: Calculated by the authors from data from the Texas Education Agency.



analysis.^7 Data envelopment analysis is a nonparametric
estimation procedure that compares each district
to a production frontier. Thus, after controlling for 
student performance and cost differences, lower spend- 
ing districts are considered to be operating with “best 
practices,” whereas any extra spending may be inter- 
preted as a measure of school district inefficiency. 

Cost Function Results 

To account for the large variance in district size in Texas, 
we weight the regressions by district enrollment and 
drop Dallas and Houston from the sample. Because of 
missing test scores, we were also forced to exclude 163 
of the 968 K-12 school districts. These excluded dis- 
tricts tend to be somewhat smaller, poorer, and higher 
spending than the 803 districts that remain in our 
sample and that provide the basis for the cost function 
estimation. 



Recall that we treat the school outcome variables as en- 
dogenous and estimate equation (3) using two-stage least 
squares.^8 The first two columns of table 2 present cost
function results that include a measure of school district
efficiency, whereas the second two columns are estimated
without that variable. The test scores have the expected 
signs; because lagged scores are a proxy for past levels of 
students’ achievement, high scores mean that districts 
can spend less to achieve a given level of educational 
progress. The cost variables generally have the expected 
signs, and many are statistically significant. Consistent 
with previous studies, we find a U-shaped relationship 
between spending per pupil and school district size; with 
our estimates, the bottom of the U is at roughly 22,026 
students when the efficiency measure is included and 
9,115 when it is not included. In contrast to the results 
of some other studies, we find that costs do not appear 
to be higher for high school students, although that vari- 
able is not statistically significant. 
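
The enrollment levels at the bottom of the U follow directly from the two enrollment coefficients in table 2. With log enrollment and its square in the equation, log spending is minimized where the derivative with respect to log enrollment is zero, that is, at an enrollment of exp(-b1 / (2 × b2)). The short sketch below reproduces the two figures from the rounded coefficients shown in the table; the function name is introduced here only for illustration.

```python
import math


def cost_minimizing_enrollment(b_log, b_log_sq):
    """Enrollment at which predicted per pupil spending is lowest.

    Solves b_log + 2 * b_log_sq * ln(N) = 0 for N.
    """
    return math.exp(-b_log / (2.0 * b_log_sq))


# Coefficients on log enrollment and its square, as printed in table 2.
with_efficiency = cost_minimizing_enrollment(-0.20, 0.01)
without_efficiency = cost_minimizing_enrollment(-0.31, 0.017)

print(round(with_efficiency), round(without_efficiency))
```

Because the published coefficients are rounded, small changes in them move the implied minimum considerably, which is one reason the text describes these enrollment levels as approximate.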



^7 See Duncombe, Ruggiero, and Yinger (1996) for further discussion of the measurement of school district efficiency using data envelopment
analysis.

^8 The detailed first-stage regression results can be found in the authors’ Public Finance Review article.










Table 2. Education cost function for 803 Texas K-12 school districts: 1995-96

Dependent variable: Log of expenditures per pupil

                                                           With efficiency index  Without efficiency index
Independent variables                                   Coefficient  t-statistic  Coefficient  t-statistic
Intercept                                                    3.23**         1.66        -2.29        -0.47
Log of composite TAAS pass rate, 1995-96                      3.34*         2.25        7.29*         2.02
Log of lagged composite TAAS pass rate, 1993-95              -2.53*        -2.16       -5.93*        -2.04
Log of average ACT score                                      1.03*         3.40        1.76*         2.44
Teacher salary index                                       0.0015**         1.88      0.0031*         2.10
Percent of students eligible for free and
  reduced-price lunch                                          0.12         1.64        0.57*         2.92
Percent of students with disabilities                          0.02         0.12         0.55         1.42
Percent of students with severe disabilities                   3.58         1.03         9.43         1.29
Percent of students with limited English proficiency          0.41*         3.86        0.66*         2.63
Percent of students enrolled in high school                   -0.20        -0.63         0.20         0.32
Log of student enrollment                                    -0.20*        -4.25       -0.31*        -3.24
Square of log of student enrollment                           0.01*         3.95       0.017*         3.05
Efficiency index                                             -1.08*        -9.91            †            †
Sum of squared errors (SSE)                                   8.571                     8.571

† Not applicable.

* Indicates statistically significant at the 5 percent level.

** Indicates statistically significant at the 10 percent level.

NOTE: TAAS is Texas Assessment of Academic Skills.

SOURCE: Calculations by the authors.



The differences in the cost functions with and with- 
out the efficiency measure highlight one of the draw- 
backs of the technique we use to measure efficiency. 
Our measure of efficiency captures the effect of all fac- 
tors that lead spending to be higher than the mini- 
mum cost of providing any given mix of public school 
output. Thus, school districts with above-average 
spending on things not measured by standardized tests 
(e.g., advanced music and arts courses) will be charac- 
terized as inefficient. Also, higher spending that is at- 
tributable to the higher costs of, for example, educating 
an above-average share of economically disadvantaged 
students will, in part, be characterized as “inefficiency.” 
As pointed out by Duncombe, Ruggiero, and Yinger 
(1996), the fact that these higher costs will be attrib- 
uted in part to the efficiency measure and in part to 
the cost factors explicitly included in the cost func- 
tion will mean that the cost function estimates with 
the efficiency measure will provide an underestimate 
of the full effects of the cost factors on education spend- 
ing. This could explain, for example, why the coeffi- 
cients on many of the cost factors increase when we do 
not include the efficiency measure. On the other hand, 
the coefficients in the model without the efficiency 
measure may be biased upward. We suspect that the 
“true” cost effects lie somewhere in between those in- 



dicated by the cost functions estimated with and with- 
out the efficiency adjustment. 

In summary, our estimated cost function suggests that 
in Texas, characteristics of school districts beyond the 
control of local school officials contribute to the amount 
of money needed to achieve any given level of student 
performance. This implies that equal per pupil spend- 
ing should not be expected to result in equal student 
performance gains in all districts. 

Cost Index Construction 

Estimating a cost function provides information about 
the contributions of various characteristics of school 
districts to the costs of education. The calculation of a 
cost index allows for the summarization of all the in- 
formation about costs into a single number for each 
district. For example, if we assume that the 
policymakers in a state define the minimum standard 
for an accountability system as the current average level 
of student performance, then a cost index can be con- 
structed that will indicate, for any given district, how 
much that district must spend, relative to the district 
with average costs, for its students to meet the state’s 
student performance standards. 
To demonstrate the calculation of cost indexes, we set
the TAAS scores and ACT scores at the average for all 
Texas districts. As discussed above, we use a value-added 
measure of student achievement in our cost function. 
Thus, the coefficient on 1995-96 scores reflects the 
increase in spending associated with an increase in stu- 
dent performance given an initial level of test score 
performance in 1994-95. In calculating the cost in- 
dex, we set the lagged score equal to the average as 
well; thus, our performance standard is not the aver- 
age level of student performance but the average gain 
in performance, that is, the average increase in the 
percentage of students passing the TAAS exams. The 
values of the cost factors are allowed to vary for each 
district, so we predict the level of spending required 
for each district to achieve this average gain, given their 
actual costs. 
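
As a rough illustration of this calculation: in a log-linear cost function, holding the outcome measures at their means implies that a district's predicted spending, relative to a reference district whose cost factors all equal the sample means, is the exponential of the coefficient-weighted deviations of its cost factors from those means. The sketch below is a simplified version under several assumptions introduced here: it uses only three cost factors, takes their coefficients from the specification without the efficiency index in table 2, treats the percentage variables as shares between 0 and 1, and applies them to three hypothetical districts. It is not the full procedure, which uses every cost factor in the estimated equation.

```python
import numpy as np

# Coefficients from table 2 (specification without the efficiency index).
# Assumption: percentage variables enter the regression as shares (0 to 1).
beta = np.array([0.0031, 0.57, 0.66])  # salary index, low-income share, LEP share

# Sample means from table 1, with percentages converted to shares.
mean_factors = np.array([84.5, 0.444, 0.065])

# Hypothetical districts (invented values): average, high-cost, low-cost.
districts = np.array([
    [84.5, 0.444, 0.065],
    [95.0, 0.800, 0.300],
    [75.0, 0.100, 0.010],
])


def cost_index(factors, means, coefs):
    """Index = 100 * predicted spending relative to the average-cost district."""
    return 100.0 * np.exp((factors - means) @ coefs)


index = cost_index(districts, mean_factors, beta)
print(np.round(index, 1))
```

By construction the first hypothetical district has an index of 100; districts with higher salary costs and larger shares of low-income or limited English proficient students come out above 100, and districts with lower values come out below.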

We want to emphasize that alternative performance 
standards could be used to calculate the cost index; 
we use the average gain in scores here only as an ex- 
ample. The use of a different standard will not affect 
the relative ranking of districts in terms of their costs 
but will change their absolute cost index values and 
thus will influence any distribution of state aid that 
depends on the cost index. 

Using the cost function estimated without the effi- 
ciency index, we calculate that the school district in 



Texas with average costs (i.e., where each of the cost 
factors is set equal to its mean) must spend $5,610 
per pupil (in 1995-96) to reach our performance 
standard. For any given school district, the product 
of this number and its cost index (divided by 100) 
will indicate the minimum amount that district must 
spend to meet the student performance goal. Thus, 
for example, a Texas school district whose cost index 
is 125 will need to spend $7,012 per student 
($5,610 times 1.25) to reach the student perfor- 
mance standard. 
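
The conversion from an index value to a dollar amount is a single multiplication, sketched here with the $5,610 figure from the text. The function name is hypothetical and used only for illustration.

```python
# Spending needed in the average-cost district (1995-96 dollars, from the text).
AVERAGE_COST_SPENDING = 5610


def required_spending(cost_index):
    """Minimum per pupil spending implied by a cost index (100 = average)."""
    return AVERAGE_COST_SPENDING * cost_index / 100


print(required_spending(125))
```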

The first column of table 3 shows the variation in costs 
across K-12 school districts in Texas. The district with 
the lowest costs could achieve average performance by 
spending about two-thirds as much per pupil as the 
district with average costs. At the other extreme, the 
district with the highest costs must spend almost twice 
as much as the average cost district to provide an aver- 
age educational outcome for its students. The large 
range of the index reflects in part the values of the 
index in a few districts. Ignoring the 10 percent of 
districts with the lowest index values and the 10 per- 
cent of districts with the highest values substantially 
reduces the range of the cost index. The restricted range 
in table 3 shows that the district at the 10th percen- 
tile has costs that are about 20 percent below average 
cost and the district at the 90th percentile has costs 
that are 20 percent above average. 



Table 3. Distribution of education cost indexes for 803 Texas K-12 school districts: 1995-96

                                                Cost index with no         Cost index with
                                             efficiency adjustment   efficiency adjustment             Texas index
Mean                                                         100.0                   100.0                   100.0
Median                                                        96.4                    98.4                    98.0
Standard deviation                                            17.7                     7.1                    14.9
Range                                                        124.8                    49.8                    83.3
Minimum                                                       67.1                    86.7                    75.1
Maximum                                                      191.9                   136.5                   158.4
Restricted range                                              38.0                    14.8                    36.3
Minimum at 10 percent                                         82.1                    93.0                    83.3
Maximum at 90 percent                                        120.1                   107.9                   119.6
Correlations:
Cost index with no efficiency adjustment                     1.000                       †                       †
Cost index with efficiency adjustment                        0.959                   1.000                       †
Texas index                                                  0.513                   0.558                   1.000

† Not applicable.

SOURCE: Calculations by the authors.
When the estimated cost functions include no measure
of efficiency, it is possible that we are interpreting extra 
spending that is caused by inefficiencies on the part of 
school districts as higher costs. When a measure of effi- 
ciency is included in the calculation of the cost index, 
the maximum cost index falls from 192 to 136. This 
suggests that the high cost index numbers for some dis- 
tricts may reflect in part some degree of inefficiency on 
the part of these local school districts. It is important to 
emphasize that even after adjusting cost indexes for in- 
efficiency, the variation in costs across districts remains 
substantial. The correlation between the indexes with
and without the efficiency measure is 0.96, suggesting
that including a measure of efficiency has relatively
little effect on the rank ordering of districts in
terms of costs but can significantly re- 
duce the range. As mentioned previ- 
ously, however, one must take care in 
interpreting this difference as entirely 
attributable to inefficiency. 

The school finance system in Texas dis- 
tributes state aid to local school districts 
using several formulas that include a 
number of adjustments for cost differ- 
ences across districts. Although the for- 
mulas do not include a single cost index, 
they do include separate adjustments 
for cost-of-living differences, for 
diseconomies of scale in small and mid- 
size districts, and for the higher costs 
necessary to provide education to students from eco- 
nomically disadvantaged families, students with disabili- 
ties, and students with limited proficiency in English. 
Although the cost-of-living adjustments were developed 
from a careful empirical study, the origin of the other 
weights and adjustments is unclear. We suspect, how- 
ever, that the explicit and implicit weights given to each 
of these cost factors were determined as a result of com- 
plex political negotiations and thus are not likely to re- 
flect true cost differences. In contrast, the weighting of 
each cost factor in our cost index comes from the pa- 
rameter estimates of the cost function. If our cost func- 
tion is estimated correctly, these weights indicate the 
relative contribution of each cost factor to the overall 
costs of achieving a given student performance standard. 



To determine whether the current set of cost adjust- 
ments used in the distribution of state aid in Texas is 
compatible with reaching student performance stan- 
dards throughout the state, we compare our cost in- 
dex to an implicit index generated from the Texas aid 
program. To construct this index, we add together the 
basic foundation level (called the “basic allotment” and 
equal to $2,387 in 1995-96) and the total amount of 
each district’s special allotments reflecting all of the 
cost adjustments mentioned previously (for district 
size, for student disabilities, etc.). For each district, 
this sum is converted into an index number by divid- 
ing each sum by $3,453, the mean value of these sums 
across all districts.^9 Summary statistics of the resulting
index, labeled Texas index, are shown in the third
column in table 3. 
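
The construction of the implicit Texas index can be sketched as follows, using the $2,387 basic allotment and the $3,453 mean reported in the text. The function name is hypothetical, and scaling the result so that the mean equals 100 is an assumption made to match the presentation in table 3.

```python
BASIC_ALLOTMENT = 2387   # basic foundation level per pupil, 1995-96
MEAN_TOTAL = 3453        # mean of basic plus special allotments across districts


def texas_index(special_allotments):
    """Implicit aid-formula index for a district (100 = statewide mean)."""
    return 100 * (BASIC_ALLOTMENT + special_allotments) / MEAN_TOTAL


# A district receiving the average $1,066 in special allotments.
print(texas_index(1066))
```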

Although the range and the re- 
stricted range of the Texas index are 
between the ranges of the two vari- 
ants of our cost index, the simple 
correlations between our indexes and 
the implicit Texas index are relatively 
low — 0.558 and 0.513 for our in- 
dexes with and without the effi- 
ciency adjustment, respectively. As 
we shall demonstrate below, there are 
two important reasons why the in- 
dexes differ. First, our index is quite 
highly correlated with the percent- 
age of children from economically 
disadvantaged families, whereas the correlation between 
poverty and the implicit Texas index is much weaker. 
Second, although our cost function indicates that 
diseconomies of scale contribute to higher costs in small 
districts, the aid adjustments for small district size in 
the Texas aid formulas are much larger than the impor- 
tance of small size indicated by our cost functions. 

The Design of School Finance 
Formulas 

Foundation formulas are currently used by the major- 
ity of states to distribute state aid to local school dis- 
tricts. The formulas are designed so that each school 
district that uses a state-determined “minimum” prop- 






^9 In 1995-1996, the cost adjustments and weights used in the state aid formulas resulted in $1,066 in additional state aid (above the basic
allotment of $2,387) in the average district. The sum of these two numbers equals $3,453.
erty tax rate will be able to achieve a “foundation” level
of per pupil spending. If costs were identical in all 
school districts, then the state could guarantee that 
each school district had sufficient resources to achieve 
the state-specified minimum performance level by 
defining the foundation level as the spending neces- 
sary to produce that particular level of “output.” 

The results presented in the previous section indicate 
that costs (at least in Texas) differ substantially among 
school districts. Thus, to guarantee that all school dis- 
tricts within a state will have sufficient resources to meet 
state performance standards, it is necessary to develop a 
foundation formula where the foundation level of spend- 
ing varies according to differences in costs across dis- 
tricts and where the average foundation 
level equals the dollar amount neces- 
sary to meet the performance standards 
in districts with average costs. 






A conventional foundation aid formula
is presented in equation (4),
where $A_i$ equals the foundation aid
per pupil in district i, $E^*$ is the foundation
level of per pupil spending, $t^*$
the mandated local property tax rate,
and $V_i$ the property value per pupil
in school district i:

(4) $A_i = \max\{E^* - t^* V_i,\ 0\}$.



To adapt a foundation formula so that it will guarantee
that every district has sufficient resources to meet
the state’s performance standards (in our example, measured
as the average gain in test scores) requires a determination
of the amount of money school districts
with average costs need to meet the state standard.
Referring to this standard as $S^*$, $E^*$ can be defined as
the amount a school district with average costs must
spend to meet the standard. A foundation formula designed
to guarantee that every school district has sufficient
resources to achieve $S^*$ can be written as

(5) $A_i = \max\{E^* c_i - t^* V_i,\ 0\}$,

where $c_i$ is the value of the cost index in school district
i.^10 To demonstrate the use of this formula using Texas



data, we define $E^*$ as the expenditure needed to achieve
the average ACT scores and average TAAS performance
gain in a district with average costs. The amount of
aid allocated to district i using this cost-adjusted foundation
aid formula will be a function of the per pupil
property wealth in district i and the relative costs in 
district i. We have chosen as the foundation level of 
per pupil spending $5,610, the amount needed to 
achieve the average performance in a district with av- 
erage costs. Our choice for the required property tax 
rate (t*) is 8.6 mills (or 0.86 percent), which was the 
actual required mill rate for the first tier of the Texas 
foundation program in 1995-96. 

The El Paso school district can be used to provide an 
example of the operation of the cost- 
adjusted foundation formula. In 
1995-1996, El Paso had more than 
64,000 students, 80 percent of 
whom were non-White, and two- 
thirds of whom were from poor fami- 
lies. El Paso’s cost index was 29 
percent above average when the in- 
dex was calculated without an effi- 
ciency measure and 12 percent above 
average when the efficiency measure 
was included. These index values im- 
ply that to achieve the average gain 
in student achievement, El Paso will 
need to spend between 12 percent 
and 29 percent more than the district
with average costs. State aid could provide these
funds by establishing a cost-adjusted foundation level
for El Paso between $6,283 (1.12 × $5,610) and
$7,237 (1.29 × $5,610).
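
The arithmetic behind the cost-adjusted foundation formula and the El Paso example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the paper: the function names are ours, and the property value per pupil in the last call is a hypothetical figure chosen only to show the formula at work. The foundation level ($5,610), the required tax rate (8.6 mills), and the two cost index values (1.12 and 1.29) are taken from the text.

```python
def cost_adjusted_foundation_level(foundation_level, cost_index):
    """Per pupil spending a district with the given cost index needs (E * c_i)."""
    return foundation_level * cost_index


def foundation_aid(foundation_level, cost_index, tax_rate, property_value_per_pupil):
    """Per pupil aid under equation (5): max(E * c_i - t* * V_i, 0)."""
    required_local_revenue = tax_rate * property_value_per_pupil
    adjusted_level = cost_adjusted_foundation_level(foundation_level, cost_index)
    return max(adjusted_level - required_local_revenue, 0.0)


FOUNDATION_LEVEL = 5610.0  # dollars per pupil, from the text
TAX_RATE = 0.0086          # 8.6 mills (0.86 percent), from the text

# El Paso's cost-adjusted foundation level under the two cost indexes.
low = cost_adjusted_foundation_level(FOUNDATION_LEVEL, 1.12)
high = cost_adjusted_foundation_level(FOUNDATION_LEVEL, 1.29)
print(round(low), round(high))

# Hypothetical property value per pupil, for illustration only.
HYPOTHETICAL_PROPERTY_VALUE = 100_000.0
aid = foundation_aid(FOUNDATION_LEVEL, 1.12, TAX_RATE, HYPOTHETICAL_PROPERTY_VALUE)
print(round(aid, 2))
```

Setting the cost index to 1.0 recovers the unadjusted foundation formula, and a district whose required local revenue exceeds its cost-adjusted foundation level receives no aid.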

As discussed in the previous section, various cost fac- 
tors and pupil weights influence the distribution of 
state aid in Texas. To focus on how the distribution of 
state aid would change by replacing the existing 
weights and adjustments with ones that are derived 
from our estimated cost function, we conducted sev- 
eral simulations of Texas school aid using a cost- 
adjusted foundation formula with alternative cost ad- 
justments. The first two columns of table 4 summa- 
rize the distribution of cost-adjusted foundation aid 
using our cost index, without and with the efficiency 



¹⁰ See Ladd and Yinger (1994) for a detailed derivation of a cost-adjusted foundation formula.
Table 4. Distribution of aid per pupil for 800 Texas K-12 school districts under alternative cost-adjusted foundation formulas: 1995-96

| | Cost index (with no efficiency measure) | Cost index (including efficiency measure) | Cost-adjusted (with Texas index) | Percent difference between average of cost index formulas and Texas index formula |
|---|---|---|---|---|
| Mean | $4,170 | $4,167 | $4,154 | 0.3% |
| Standard deviation | 1,580 | 1,200 | 1,325 | 4.9% |
| Minimum | 0 | 0 | 0 | 0 |
| Maximum | 10,560 | 7,400 | 7,941 | 13.1% |
| Total aid (in billions of dollars) | $13.7 | $13.7 | $11.9 | 15.1% |
| District size quintiles | | | | |
| 1 (smallest) | $4,236 | $4,202 | $4,385 | -3.8% |
| 2 | 3,737 | 3,961 | 3,409 | 12.9% |
| 3 | 4,575 | 4,328 | 3,725 | 19.5% |
| 4 | 3,900 | 4,038 | 3,415 | 16.2% |
| 5 (largest) | 4,534 | 4,280 | 3,564 | 23.6% |
| Percent poor quintiles | | | | |
| 1 (fewest poor) | $2,977 | $3,652 | $3,431 | -3.4% |
| 2 | 3,557 | 3,891 | 3,953 | -5.8% |
| 3 | 3,857 | 3,943 | 3,970 | -1.7% |
| 4 | 4,587 | 4,350 | 4,475 | -0.1% |
| 5 (most poor) | 6,407 | 5,307 | 5,109 | 14.6% |

SOURCE: Calculations by the authors.



measure. The next column of table 4 shows the distri- 
bution of aid using the Texas index. Recall that the 
Texas index reflects the pupil weights and other cost 
adjustments used in the actual distribution of state 
aid in academic year 1995-96. All three simulations
use the same foundation level (E) and required tax rate
(t*).

The simulation results show that including an effi- 
ciency measure in the cost function used to construct 
the index has little effect on the size of the average 
grant but substantially reduces the magnitude of the 
largest grant.¹¹ Distributing grants using the cost-adjusted
formulas based on our cost index would have
required an aid budget of $13.7 billion. Distributing 
aid using the Texas index would require an aid budget 
of only $11.9 billion because the Texas index would 
distribute the largest per pupil grants to the smallest 
school districts. 

The differences in the pattern of aid distribution can
be seen most clearly in the bottom two panels of table 4.
In the middle panel, we have divided school districts
into pupil-weighted quintiles by district size.
Each quintile thus includes approximately the same 
number of students but a different number of districts. 
Included in the first quintile are the 595 school dis- 
tricts with enrollment below 3,320, whereas the fifth 
quintile contains just 11 school districts, all of which 
have enrollments in excess of 43,550. The data show 
clearly that the Texas school aid formula allocates more 
aid to small school districts and considerably less aid 
to large school districts than would a foundation for- 
mula based on cost adjustments derived from our esti- 
mated cost functions. Comparing the average aid 
allocation from our two cost index simulations (with 
and without the efficiency measure) with the aid allo- 
cation from the Texas index simulation, aid would in- 
crease by nearly 25 percent in the largest district-size 
quintile, whereas aid would be reduced by about 4 
percent in the smallest size quintile. 
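
One way to build pupil-weighted quintiles of the kind described above is sketched below. The text does not say how districts that straddle a quintile boundary were assigned, so the midpoint rule used here is our assumption, and the enrollment figures in the example are hypothetical.

```python
def pupil_weighted_quintiles(enrollments):
    """Assign each district to a quintile holding roughly 20% of all pupils.

    Districts are ordered from smallest to largest. A district's quintile is
    set by where the midpoint of its pupils falls in the cumulative
    enrollment distribution (an assumed tie-breaking rule).
    """
    total = sum(enrollments)
    order = sorted(range(len(enrollments)), key=lambda i: enrollments[i])
    quintile = [0] * len(enrollments)
    cumulative = 0
    for i in order:
        midpoint = cumulative + enrollments[i] / 2
        quintile[i] = min(int(5 * midpoint / total) + 1, 5)
        cumulative += enrollments[i]
    return quintile


# Hypothetical enrollments: many small districts and a few large ones.
enrollments = [100] * 10 + [500, 500, 1000, 1000, 1000]
quintiles = pupil_weighted_quintiles(enrollments)
districts_per_quintile = [quintiles.count(q) for q in range(1, 6)]
print(districts_per_quintile)
```

As in the Texas data, equal pupil counts per quintile translate into very unequal district counts: the smallest quintile holds many districts and the largest holds only a few.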

The data in the bottommost panel of table 4 divide 
school districts into pupil-weighted quintiles by the 



¹¹ Ranking per pupil grants by size, the grant at the 90th percentile is $580 larger when the efficiency measure is not included in the cost
index calculation than when the efficiency measure is included.
percentage of district enrollment that is poor. Although
all three simulations generate larger per pupil grants 
to school districts with concentrations of poor pupils, 
comparing the aid distributions indicates that our cost 
functions imply a higher weight on concentrated pov- 
erty than the weight given to poverty in the actual 
Texas school aid formulas. On average, the two cost 
index simulations generate a 15 percent larger per pu- 
pil grant in the highest poverty quintile than the grant 
generated by the pupil weights used in the existing 
state aid formulas. 
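
The last column of table 4 can be approximated from the other three columns. The sketch below is our reading of the column heading (the average of the two cost index grants, compared with the Texas index grant); because the published dollar figures are rounded, the results can differ from the table by a tenth of a percentage point.

```python
def percent_difference(no_efficiency, with_efficiency, texas_index):
    """Percent by which the average cost index grant exceeds the Texas index grant."""
    average_cost_index_aid = (no_efficiency + with_efficiency) / 2
    return 100 * (average_cost_index_aid - texas_index) / texas_index


# Per pupil grants taken from table 4.
highest_poverty = percent_difference(6407, 5307, 5109)
largest_size = percent_difference(4534, 4280, 3564)
smallest_size = percent_difference(4236, 4202, 4385)

for label, value in [
    ("highest poverty quintile", highest_poverty),
    ("largest size quintile", largest_size),
    ("smallest size quintile", smallest_size),
]:
    print(f"{label}: {value:.1f}%")
```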

By definition, the first two simulations in table 4 are 
designed to distribute state aid in such a way that 
every school district would be provided with an 
amount of revenue sufficient to en- 
able them to achieve at least the cur- 
rent state average gain in TAAS 
scores. Our cost function results 
clearly indicate that improving stu- 
dent performance requires additional 
resources. It is thus not surprising 
that the $13.7 billion budget for 
implementing either one of the cost- 
adjusted aid foundation formulas 
will be greater than the amount the 
state actually spent on school aid. 

In fact, for the 1995-96 academic 
year, the state distributed $6.8 bil- 
lion in state aid to the 800 school 
districts included in the simulations.¹² This implies
that providing all school districts with sufficient
revenues to achieve average gains in TAAS scores would
have required a doubling of state aid (actually a 101
percent increase in aid). To
put this increase in state aid in context, Texas in 1995- 
96 was a relatively low-spending state. At $5,473, it 
ranked 33rd in expenditures per pupil (Snyder 
1999). In addition, at 42.9 percent, the state 
government’s share of total education revenue was be- 
low the national average. The state share of educa- 
tion funding was higher in 31 other states. If the 
state government had increased aid to local school 
districts by 85 percent, the state share would have 
risen to 60.2 percent of total revenue, a share that 
would have still been lower than the state share in 
11 other states.



Conclusions 

Policy debates are raging about how student perfor- 
mance should be measured, the type of tests that 
should be used, and the appropriate role testing 
should play. Despite strong disagreement concern- 
ing the answers to these issues, there appears to be 
a growing consensus that measuring student aca- 
demic performance is absolutely necessary if the 
quality of education provided to many of the 
nation’s poor children is to improve. In this article, 
we have argued that if states are going to require 
their students to meet these more rigorous educa- 
tional goals, they must recognize that achieving these 
goals will require more resources in some school dis- 
tricts than in other districts for rea- 
sons that are outside the control of 
local school officials. This implies 
that a necessary, though not suffi- 
cient, condition for achieving any 
given performance goal is that state 
fiscal assistance to local school dis- 
tricts account explicitly for differ- 
ences in costs across districts within 
a state. We have demonstrated that 
a cost-adjusted foundation formula 
can be an effective instrument for 
this purpose. 

We use data from Texas to show that 
it is possible to measure cost differ- 
ences across districts and that these 
cost differences are large. We then demonstrate the 
use of cost-adjusted foundation formulas as a mecha- 
nism for distributing state aid in a way that will en- 
hance the chances that a state can meet its student 
performance goals. In Texas, where cost considerations 
already play a major role in the distribution of state 
aid to local school districts, we conclude that reform- 
ing the existing state aid formulas to provide a heavier 
weight to children from economically disadvantaged 
families and a lower weight to small, mainly rural, dis- 
tricts would better align the distribution of fiscal re- 
sources with the underlying costs of education. 

It is important to note that the debates over education 
standards center around two different educational 






¹² Because of missing data, three school districts had to be dropped before conducting the aid simulations reported in table 4.
goals. In this article, we have focused on the goal of
annual improvement in student performance. But a 
second goal involves bringing all students (or groups 
of students, characterized by race, gender, or location) 
up to a target performance level. Policies in a number 
of states requiring graduation tests and prohibiting so- 
cial promotions are examples of absolute student per- 
formance standards. The total cost of achieving any 
absolute standard in a particular school or school dis- 
trict depends in large part on the size of the achieve- 
ment gap between the current level of student 
achievement and the standard. 

Because the level of student achieve- 
ment in a number of Texas school dis- 
tricts is substantially below the average 
level of achievement, it is not surpris- 
ing that these districts will require a 
substantial infusion of new resources 
if they are to close the achievement 
gap. We recalculated our cost index so 
that it indicated the cost, relative to 
the district with average costs, of reaching
the statewide average level of student
achievement on the TAAS.¹³ The
range of the resulting indexes increased 
substantially; the school district with 
the highest costs had an index value of 
718 with no efficiency adjustment, and 220 with the 
efficiency adjustment. To implement a cost-adjusted 
foundation formula that would guarantee each school 
district enough money to reach the average student 
performance level would require a substantial increase 
in the size of the state aid budget — to $21.2 billion 
without the efficiency measure and to $16.9 billion 
with the efficiency measure. 

In this article, we have demonstrated that the costs of 
achieving a gains-based standard (in our example, the 



statewide average annual gain in test scores) will vary 
substantially across school districts. To ensure that all 
school districts have adequate resources to sustain an- 
nual student performance gains, districts with higher 
costs will have to be guaranteed additional state fiscal 
assistance. If annual achievement gains can be main- 
tained, then, over time, low-performing school dis- 
tricts will be able to meet absolute student achievement 
goals. Obviously, school districts with the lowest lev- 
els of current student performance will take the long- 
est time to reach state-imposed standards. 

One of the most contentious issues 
in the debate surrounding the re- 
authorization of the Elementary and 
Secondary Education Act was 
whether low-performing schools 
should be required to achieve state- 
imposed performance standards 
within a fixed number of years. Our 
estimated cost functions could be in- 
terpreted as suggesting that if a school 
district with high costs is provided 
with sufficient additional funds, it 
could fully offset the disadvantages 
of higher costs in a single year. 

We believe that this implication is 
not justified. In fact, school districts with large achieve- 
ment gaps will, under most circumstances, take more 
time to reach any specified state standard than dis- 
tricts with smaller gaps. From a purely statistical point 
of view, using our estimated cost function to reach 
conclusions about the money needed to close any 
given achievement gap within a year generally requires 
extrapolation beyond the data.¹⁴ More important, in
recent years there have been substantial advances in 
the development of teaching techniques that are ef- 
fective in improving the academic performance of low- 






¹³ In the estimation of the cost function, lagged student performance is treated as an endogenous variable because, as with current
performance, it is, in part, a choice of the district. In creating the cost index, we want to abstract away from any variation that is under 
the control of the district. Thus, to account for the endogeneity of the lagged scores, we calculate the cost index using predicted lagged 
scores, with the predictions based on the coefficient estimates from the first-stage regression, actual values of the cost factors, and state 
average values for the demand instruments. That is, a district’s predicted lagged score reflects the score expected from a district with 
average preferences and observed cost factors. Put together with the average 1995-1996 score, the level of spending predicted by the cost 
function is the spending required to reach average achievement given average tastes for education and actual cost factors. 

¹⁴ If a high-cost district has a cost index value of 300, this implies that this district will need to spend three times as much per pupil as the
district with average costs for its students to reach the student performance standard, say average performance, on the TAAS. Although 
this conclusion may be correct, assuming that it can be achieved within a single year requires that we use our estimated cost function to 
extrapolate beyond our data; that is, there are no school districts that achieve average student performance while spending three times the 
spending level in the district with average costs. 
achieving students. Experts on learning recognize, however,
that the processing of new knowledge, information,
and concepts takes time (Bransford, Brown, and
Cocking 2000). Although there is only limited research 
on the time needed to acquire and process knowledge, 
it is probably unrealistic to expect that students who 
are currently performing at substantially below grade 
level can catch up within a single year even if addi- 
tional resources are devoted to their education. 

Although providing additional financial aid to school 
districts with large achievement gaps is a crucial step 
toward reducing those gaps, it is also likely that in 
many school districts, a large, sudden increase in state 
aid would not be used effectively to increase student 
performance (Duncombe and Yinger 2000). Provid- 
ing new money to schools and school districts with 
above-average costs, if it is phased in over a period of 
years, is likely to be most effective in increasing stu- 
dent performance. 

Specifying a fixed time limit within which state per- 
formance standards must be met and imposing sanc- 



tions on those districts failing to meet the deadline is 
likely to penalize school districts that are currently 
performing at low levels, even if these districts suc- 
ceed in making adequate annual progress in improv- 
ing their students’ test scores. Such a policy could 
lead to discouragement instead of improved achieve- 
ment. Because an important role of higher student 
performance standards is to create incentives for 
schools, teachers, and students to increase the amount 
of learning that occurs, such standards must be set at 
reasonable levels. 

There are still many issues to resolve in how educa- 
tional costs and school outputs are measured and in 
how to reform policy to account for these costs, but it 
is clear that improving the educational performance of 
all students requires the annual measurement of stu- 
dent performance, the setting of reasonable goals, and 
the allocation of state and federal aid to school dis- 
tricts in a way that recognizes differences among school 
districts both in fiscal capacities and in the costs of 
providing education. 
Introduction

For over a decade, perhaps no other issue in educa- 
tion has generated the same level of debate and policy 
activity as school accountability. At their most basic, 
accountability policies tie school rewards and sanc- 
tions to measures of school performance, typically 
specified as either performance levels (for example, 
aggregate percentile ranks or the percentage of stu- 
dents meeting specified benchmarks) or changes in 
performance (for example, increases in aggregate test 
scores or in the percentage of students meeting 



benchmarks). While most accountability efforts have 
been enacted at the state and local level, the peak of 
this movement may be the federal No Child Left 
Behind (NCLB) Act of 2001, which requires states 
to demonstrate adequate yearly progress in reading 
and mathematics performance by school and by sub- 
groups within schools. Common to these reform ef- 
forts is the underlying notion that incentives based 
upon measures of school performance will spur im- 
provements in student performance. 

Given the popularity of accountability reforms around 
the country, it is not surprising that considerable at- 
tention is being paid to evaluating the impact of these 
reforms, both intended and unintended, and assess- 
ing the incentives and disincentives embedded in those 
reforms (see, for example, Cullen and Reback 2002;
Figlio and Winicki 2002; Figlio 2003). In contrast,
relatively little attention has been paid to identifying 
and specifying valid and reliable measures of school 
performance, even though performance measurement 
lies at the heart of these reforms. Developing appro- 
priate methods is clearly necessary — though not suffi- 
cient — to creating and implementing accountability 
systems that function as policymakers intend. This 
paper examines alternative methods of measuring
school performance, considering both practical and 
conceptual issues and evaluating the relative merits of 
these different measures in applications using data on 
schools in New York City and Ohio. 

Although largely overlooked in the implementation of
most accountability reforms, one of the most difficult
challenges is comparing schools that educate diverse
and differing student populations and work with
different levels of resources and institutional
constraints. Put simply, it has been
well established since at least the time of the Coleman 
report (1966) that variations in student performance, 
particularly as measured by standardized test scores, 
are highly correlated with students’ 
socioeconomic backgrounds. On av- 
erage, students from higher socioeco- 
nomic backgrounds do better. Thus, 
performance measures that fail to ac- 
count for these student differences 
risk rewarding schools whose task is 
in many ways “easier” because of the 
out-of-school advantages that stu- 
dents from wealthier households tend 
to enjoy, and because of the poten- 
tial peer effects generated when the 
distribution of these students is clus- 
tered in certain schools. Likewise, 
schools serving primarily students 
without these advantages may appear 
to be low performing due largely to factors outside 
their control. Similarly, schools in many urban and 
rural areas may be labeled low performing in part be- 
cause they lack the necessary resources to meet the 
performance levels of schools in wealthier areas. 

Of course, while most accountability systems focus on 
the level of performance (say, on standardized tests), 
school efficiency (in using resources to produce de- 
sired outcomes) is perhaps even more crucial in today’s 
constrained fiscal environment. While the sets of high- 
performing and highly efficient schools will often over- 
lap, schools with generous resources may be able to 
achieve high performance without optimizing the use 
of their resources. As state and local budgets bind tightly 
enough that shortening the school week is seriously con- 
sidered (Reid 2002), finding schools that make the most 
effective use of their limited resources (and learning how 
they do so) becomes increasingly important. 






In this paper, we describe, analyze, and compare four 
alternative methods of measuring school performance 
and efficiency. Using data from schools in one state 
(Ohio) and one large city (New York City), we use 
these different methods to estimate the relative effi- 
ciency of public schools and to explore the similari- 
ties and differences between the results obtained. To 
be specific, we explore the extent to which the re- 
sulting efficiency measures differ from one another, 
particularly in the identification of “good” and “bad” 
schools. Further, we analyze how and why the meth- 
ods differ, both in their theoretical underpinnings 
and in their practical applications, and highlight 
strengths and weaknesses of each method. The re- 
mainder of this paper proceeds as follows. The next 
section describes previous work on 
the measurement of school perfor- 
mance and efficiency, including con- 
ceptual and empirical issues raised 
by the research. This is followed by 
brief overviews of the data and the 
four techniques employed in this 
paper: adjusted performance measures
(APMs), data envelopment
analysis (DEA), educational produc- 
tion functions (EPFs), and cost func- 
tions. A final section presents the 
results of analyses using the four 
methods with similar data sets, com- 
pares the results, and presents con- 
clusions on the use of these methods 
for school performance and efficiency measurement. 

Background and Literature 

In recent years, a relatively small body of research 
has begun to accumulate that considers conceptual 
and empirical issues raised by efforts to measure 
school performance, specifically in the design and 
implementation of accountability systems. For example,
Hanushek and Raymond (2002) and Ladd
(2002) review a number of questions that such poli- 
cies raise: Do the performance measures reflect the 
material taught? Is the performance of all students, 
teachers, and administrators taken into account? 
What are the most appropriate target scores or rates 
of increase? What incentives and disincentives are 
embedded in the system? What data are needed and 
how do errors in the data affect the performance 
assessment? While many researchers acknowledge 
that comparing schools based on average test scores
may be unfair because schools may be held respon- 
sible for factors (like student background) over which 
they have no control, deciding which factors should 
be controlled is also a matter of considerable con- 
troversy. For example, there may be general agree- 
ment that schools serving high proportions of 
students from low-income families face greater chal- 
lenges in generating a high level of student achieve- 
ment, but there is less agreement that it is 
appropriate to control for race (Clotfelter and Ladd 
1996; Ladd 2002; Ladd and Walsh 2002). Problems 
may arise when using changes in test scores as out- 
put measures because two cohorts of students within 
the same school may differ in background and cu- 
mulative school inputs, thereby pro- 
ducing biased measures (Hanushek 
and Raymond 2002; Linn and Haug 
2002; Ladd and Walsh 2002). Mo- 
bility, exemptions from testing, and 
measurement error may also lead to 
poor measures (see Clotfelter and Ladd 
[1996], Hanushek and Raymond 
[2002], Ladd [2002], and Ladd and 
Walsh [2002] on mobility; Hanushek 
and Raymond [2002], and Kane and 
Staiger [2002], on exemptions from 
testing; and Hanushek and Raymond 
[2002], Ladd [2002], Ladd and 
Walsh [2002], and Linn and Haug 
[2002] on measurement error). 

Clotfelter and Ladd (1996) observe performance in- 
centive programs in Dallas and South Carolina to de- 
termine whether they can and do affect performance. 
The authors look at various measures of school perfor- 
mance, some based on changes, others on residuals. 
Overall, they find that measures based on changes in 
scores are highly correlated and those based on residu- 
als are correlated, but correlations between changes and 
residuals are much lower. Ladd and Walsh (2002) as- 
sess the value-added approaches used to measure school 
effectiveness in South Carolina and North Carolina. 
Both states employ simple value-added measures based 
upon data on current and past student test scores, but 
do not include adjustments for family background and 
school-level resources. Using data for 1993-95 on the 
reading and mathematics fifth-grade test scores of more 
than 37,000 North Carolina students, they investi- 
gate the impact of specification and measurement er- 



rors on school rankings and find that the ranking of 
schools is somewhat sensitive to specification — rankings 
derived from fixed effects regressions differ from those 
based upon the mean of residuals over the years of the 
data. More importantly, using an instrumental vari- 
able approach, they find that measurement error bias 
is responsible for about two-fifths of the higher perfor- 
mance of schools with more advantaged populations 
and correcting for measurement error causes dramatic 
changes in the relative rankings of schools. 

Kane and Staiger (2002) describe the statistical prop- 
erties of a variety of performance measures using the 
performance of fourth-graders taking a mathematics 
test in 1,163 elementary schools in North Carolina 
in 1998. They also look at changes 
between 1998 and 1999 and be- 
tween 1994 and 1999. The authors 
argue that accountability systems are 
often based on imprecisely measured 
test scores, usually for one year, and 
that using (weighted) averages of test 
scores over a few years would be more 
reliable. By measuring the correla- 
tion between changes from one year 
to the next and seeing whether 
changes one year are reversed the 
following year, the authors estimate 
that at least three-quarters of the 
variance in test scores is transitory 
and that small schools, in particu- 
lar, are more apt to witness such transitory changes. 
The authors suggest grouping schools by size and 
distributing rewards/sanctions within each group or 
giving smaller awards to more schools. 
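
The logic of checking whether changes in one year are reversed the next can be illustrated with a small simulation. This is not Kane and Staiger's estimator or data; it is a sketch under our own assumptions (each school has a fixed underlying level, and each year's observed score adds independent noise). In that setting every year-to-year change is transitory, and the correlation between consecutive changes is about -0.5.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_changes(n_schools, noise_sd, seed=0):
    """Simulate three years of scores and return the two successive changes."""
    rng = random.Random(seed)
    first_change, second_change = [], []
    for _ in range(n_schools):
        level = rng.gauss(0.0, 1.0)  # persistent school level, fixed over time
        scores = [level + rng.gauss(0.0, noise_sd) for _ in range(3)]
        first_change.append(scores[1] - scores[0])
        second_change.append(scores[2] - scores[1])
    return first_change, second_change


first, second = simulate_changes(n_schools=5000, noise_sd=0.5)
reversal = correlation(first, second)
print(f"correlation between consecutive changes: {reversal:.2f}")
```

A correlation closer to zero would indicate that more of the change in scores is persistent. Smaller schools correspond to a larger noise standard deviation around the same underlying level.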

Linn and Haug (2002) confirm Kane and Staiger’s find- 
ings. Using fourth-grade reading data on Colorado 
schools for 1997-2000, the authors find that schools 
with high percentages of high-achieving students have
smaller gains than other schools. They also find that 
schools that experience a gain (or loss) between the 
first two years generally observe a loss (or gain) be- 
tween the next two years. Thus, schools that experi- 
ence a gain between two years may not have better 
educational practices than others do. And if a school 
experiences a loss and receives assistance, this assistance 
may not be responsible for the gain between the next 
two years. The authors suggest including information 
on reliability in accountability reports. 



Overview of Four Methods

Our work builds upon previous research to examine 
the properties of four quantitative techniques that may 
be used to measure school performance: adjusted per- 
formance measures, education production functions, 
data envelopment analysis, and cost functions. Each 
of these methods allows some way of accounting for 
differences in the inputs to the educational process 
across schools — primarily the characteristics of stu- 
dents and resources available to school personnel. Each 
attempts to measure school efficiency; that is, the 
school’s contribution to producing outputs with a 
given mix of students and resources, and each relies 
upon test scores to measure output. The methods dif- 
fer in their theoretical underpinnings, their data re- 
quirements, their ability to include 
multiple outputs simultaneously, 
and in the information they provide 
regarding potential sources of ineffi- 
ciency. A unique contribution of this 
paper is that we estimate efficiency 
scores using each of the methods 
based upon the same data, allowing 
us to isolate the differences between 
the methods (and the information 
they provide) from differences be- 
tween data sets and variable defini- 
tions. In addition, we use real rather 
than simulated data, which permit 
us to explore the many practical is- 
sues that arise in using administra- 
tive data for these purposes. The next 
sections review the data sets we use and then briefly 
discuss each of the quantitative techniques. 

Data 

For our New York City analyses, we employ a rich 
school-level database that includes information on stu- 
dent characteristics, test scores, and school resources. 
The DEA and APM analyses use data for the 1999- 
2000 school year only, while the EPFs and cost func- 
tion analyses use data on a balanced panel of 602 
elementary schools for 1995-96 through 2000-2001. 
(A balanced panel includes multiple years of data on 
the same schools.) The panel includes only schools with 



third, fourth, and fifth grades and valid reading and 
mathematics scores for each grade in each year. In ad- 
dition to school-level aggregates, grade-level demo- 
graphic variables (race/ethnicity, immigrant status, free 
and reduced-price lunch eligibility) were calculated 
from student-level data.¹
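
The restriction to a balanced panel can be sketched as a simple filter. The record layout and school identifiers below are hypothetical, and the actual construction of the New York City panel also required valid reading and mathematics scores for each grade, which this sketch does not model.

```python
def balanced_panel(records, required_years):
    """Keep only schools observed in every required year.

    records: list of (school_id, year, value) tuples (hypothetical layout).
    """
    years_by_school = {}
    for school_id, year, _ in records:
        years_by_school.setdefault(school_id, set()).add(year)
    required = set(required_years)
    keep = {s for s, years in years_by_school.items() if required <= years}
    return [r for r in records if r[0] in keep and r[1] in required]


# Hypothetical records: school "B" is missing 1997 and is dropped.
records = [
    ("A", 1996, 0.10), ("A", 1997, 0.12), ("A", 1998, 0.15),
    ("B", 1996, -0.20), ("B", 1998, -0.10),
    ("C", 1996, 0.00), ("C", 1997, 0.05), ("C", 1998, 0.02),
]
panel = balanced_panel(records, required_years=[1996, 1997, 1998])
schools_in_panel = sorted({school_id for school_id, _, _ in panel})
print(schools_in_panel)
```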

All test score data are reported as standardized z-scores. 
Data for the third and fifth grades come from the CTB/ 
McGraw Hill Test of Basic Skills (CTB) in reading 
and the California Achievement Test (CAT) in math- 
ematics, while fourth-grade data for 1998-99 and 
1999-2000 are from New York State English Language
Arts (ELA) reading and mathematics tests. For
comparability, the tests are normalized to New York 
City-wide averages.²
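
Standardizing scores from different tests to a common city-wide scale can be sketched as follows. The exact normalizing procedure is documented in the source cited in the footnote; the version here, which uses the population standard deviation and made-up scale scores, is only our assumption about its general form.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Express each score in standard deviations from the group mean."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (an assumption)
    return [(s - mu) / sigma for s in scores]


# Hypothetical scale scores from two different tests.
test_one = [600, 650, 700]
test_two = [40, 50, 60]

z_one = z_scores(test_one)
z_two = z_scores(test_two)
print([round(z, 3) for z in z_one])
print([round(z, 3) for z in z_two])
```

After standardization, scores from tests with different scales share a mean of zero and a standard deviation of one, which is what makes the reading and mathematics measures comparable across grades and years.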

In Ohio, the DEA and APMs use data 
for the 1997-98 school year while the 
EPFs and cost functions use a panel of 
783 schools that include both fourth 
and sixth grades during a 4-year pe- 
riod, 1995-96 through 1998-99. 
Passage rates on fourth- and sixth-grade 
writing and mathematics proficiency 
exams were used to capture outputs. 

Table 1 displays descriptive statistics 
for both the Ohio and New York City 
school samples. In developing the 
databases, we made every effort to use 
identical sets of schools and variables 
for each method. Our data sets and variables lists are 
very similar for each method, though not identical for 
a variety of reasons, which are discussed more fully in 
the “Data Challenges” section below. 

Adjusted Performance Measures (APMs) 

APMs use a regression-based technique in which an 
output, typically some type of test score measure, is 
regressed on a set of variables thought to represent factors
outside the control of the school itself. These exogenous
factors often include student and school
characteristics, typically measured as school-level — or 
perhaps grade-level — aggregates. The APM is, then, 
each school’s estimated residual value, or the differ- 






¹ For greater detail on the data, see Schwartz and Zabel (2003) and Schwartz, Stiefel, and Bel Hadj Amor (2003).
² Greater detail on the normalizing procedure is available in Stiefel, Schwartz, Bel Hadj Amor, and Kim (2003).

Table 1. Descriptive statistics for New York City and Ohio schools: 1997-2000

                                                                Mean  Standard deviation
New York City schools, 1999-2000
  Z-score, mean fifth-grade mathematics                         0.04                0.47
  Z-score, mean fifth-grade reading                             0.03                0.42
  Lagged z-score, mean fourth-grade mathematics                 0.03                0.50
  Lagged z-score, mean fourth-grade reading                     0.03                0.46
  Grade 5, percent free lunch eligible                         73.23               24.94
  Grade 5, percent reduced-price lunch eligible                 6.86                5.52
  Grade 5, percent Black                                       36.32               32.56
  Grade 5, percent Hispanic                                    33.94               26.05
  Grade 5, percent Asian                                       11.39               16.32
  Grade 5, percent with Language Assessment Battery (LAB)
    score less than 40th percentile                             7.58                7.59
  Teacher-pupil ratio                                           0.08                0.01
  Non-classroom teacher expenditures (in dollars)              5,653               1,376
  Percent of teachers with greater than 2 years' experience
    in same school                                             64.11               13.87
  Percent of teachers with master's degree                     77.27               13.64
  Percent of teachers with greater than 5 years' experience    57.66               13.44
  Percent of teachers permanently licensed or assigned         81.70               15.00
  Enrollment                                                  802.86              339.70

Ohio schools, 1997-98
  Percent passing, sixth-grade mathematics                     48.48               20.88
  Percent passing, sixth-grade reading                         52.60               18.40
  2-year lagged percent passing, fourth-grade mathematics      43.38               20.54
  2-year lagged percent passing, fourth-grade reading          44.10               18.19
  Percent free or reduced-price lunch eligible                 28.33               24.44
  Percent Black                                                13.41               25.96
  Percent Asian                                                 2.15                5.04
  Instructional expenditures per pupil (in dollars)            3,465                 606
  Noninstructional expenditures per pupil (in dollars)         1,795                 490
  Enrollment                                                  426.58              178.74

SOURCE: Authors' calculations based upon data provided by the New York City Board of Education, the Ohio Department of
Education, and Ohio Education Association.




ence between the actual school output and the output 
predicted from the regression equation (see Stiefel, 
Schwartz, Bel Hadj Amor, and Kim [2003]; 
Rubenstein, Schwartz, and Stiefel [2003]; and Stiefel, 
Rubenstein, and Schwartz [1999] for more on APMs). 

Prior-year test scores may be included as independent 
variables in order to approximate the school’s “value added” 
to student achievement over the course of the school year. 
An alternative approach is to measure the dependent vari- 
able as the change in test score between years. Resources 
within the control of the school may also be included in 
the equation to minimize bias from omitted variables. If 
such variables are included, they should be set to the 
sample mean rather than the observed value for each school 
when calculating the APM. This approach can be used 
to predict performance given observed factors outside the 



control of the school and average controllable resources 
(see Rubenstein, Schwartz, and Stiefel 2003). 
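
The APM calculation described above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the school data are invented, the variable names are hypothetical, and the small `ols` helper stands in for whatever regression routine an analyst would normally use. The controllable resource (teachers per 100 pupils) enters the regression but is held at its sample mean when the prediction is formed.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical schools (invented values): share of pupils in poverty,
# prior-year z-score, teachers per 100 pupils, current-year z-score.
schools = [
    (0.80, -0.40, 7.0, -0.35),
    (0.60, 0.10, 8.0, 0.05),
    (0.75, -0.10, 9.0, -0.10),
    (0.30, 0.40, 8.0, 0.45),
    (0.50, -0.20, 7.0, 0.00),
    (0.90, -0.50, 8.0, -0.55),
    (0.40, 0.30, 9.0, 0.50),
    (0.65, 0.20, 7.5, -0.05),
]

X = [[1.0, pov, prior, ratio] for pov, prior, ratio, _ in schools]
y = [s[3] for s in schools]
b0, b_pov, b_prior, b_ratio = ols(X, y)

# The controllable resource is set to its sample mean, not the school's own value.
mean_ratio = sum(s[2] for s in schools) / len(schools)
apm = [
    score - (b0 + b_pov * pov + b_prior * prior + b_ratio * mean_ratio)
    for pov, prior, _, score in schools
]
```

A positive entry in `apm` marks a school that scores above what its uncontrollable characteristics and average resources would predict. Because the regression includes an intercept, the APMs average to zero across the sample.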

While the APM procedure is the most straightforward 
and “user-friendly” of the techniques discussed here, 
it is important to note that APMs implicitly assume 
that all of the estimated error reflects relative school 
efficiency or inefficiency. To the extent that the error 
term captures other factors, such as measurement er- 
ror or the effects of unobserved or omitted variables, 
the residuals may under- or overestimate school effi- 
ciency. Another potential drawback is that, like most 
regression-based techniques, APMs can be calculated 
for only one output measure at a time. This does not, 
however, preclude the analyst from creating a compos- 
ite measure combining multiple APMs, perhaps stan- 
dardizing the residuals if the measurement scales differ. 

Data Envelopment Analysis (DEA)

DEA is a non-stochastic technique for assessing relative
technical efficiency across organizations.³ More
specifically, it requires the construction of a nonpara- 
metric efficiency frontier based on the observed in- 
put/output ratios of units in the sample, such that all 
of the efficient decisionmaking units (DMUs) lie on 
the frontier and “envelop” the inefficient units lying 
off the frontier. The efficient units receive an efficiency 
score of 1 (or 100) while each inefficient unit’s effi- 
ciency is calculated as one minus its distance from 
the efficiency frontier. Thus, lower scores indicate lower 
levels of efficiency. The DEA concept differs from re- 
gression-based techniques in several ways. First, DEA 
assesses efficiency in relation to the 
best results actually achieved by units 
in the sample, rather than the aver- 
age results achieved. Second, DEA can 
include both multiple inputs and 
multiple outputs. Finally, the DEA 
procedure seeks to maximize each 
unit’s efficiency rating by assigning 
unit-specific weights in the linear 
program. Therefore, units can achieve 
efficiency through specialization as 
well as through high performance 
across multiple measures. 

DEA is not a statistical technique and 
it does not produce coefficients that 
can be used for testing significance 
or inferring from this sample to populations. Unlike a 
more standard production function, though, DEA does 
not assume that the functional form is the same across 
schools. The frontier is constructed from the observed 
inputs and outputs of the units in the sample (Charnes 
et al. 1994).

A clear advantage of DEA for assessing school efficiency 
is that it can explicitly account for schools’ multiple 
outputs. However, since schools can reach the frontier 
by specializing in certain areas, some schools may be 
deemed efficient despite low performance in some ar- 
eas. The DEA technique also permits inputs to be la- 
beled as “controllable” or “uncontrollable” to school 
personnel. And while regression-based techniques may 
provide little guidance about ways to improve effi- 



ciency, DEA produces “slack” values suggesting the re- 
duction in inputs that would be possible without 
harming outputs. 
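
The envelopment idea can be illustrated for the simplest multiple-output case. The sketch below is not the specification used in this paper. It assumes one input, two outputs, constant returns to scale, and strictly positive data, and the school values are invented. Under those assumptions an optimal comparison point uses at most two peer schools, so checking every single school and every pair gives the exact score. Problems with several inputs and outputs are normally handed to a linear-programming routine instead.

```python
from itertools import combinations


def dea_scores(inputs, outputs):
    """Input-oriented, constant-returns DEA for one input and two outputs.

    The score of a school is the smallest share of its input with which a
    nonnegative combination of observed schools could still produce at
    least its outputs. A score of 1 places the school on the frontier.
    """
    # Outputs per unit of input for every school.
    rates = [(y1 / x, y2 / x) for x, (y1, y2) in zip(inputs, outputs)]
    scores = []
    for a, b in rates:
        # Single peer: scale it until it covers both outputs.
        best = min(max(a / p, b / q) for p, q in rates)
        # Pair of peers: solve m1*(p1, q1) + m2*(p2, q2) = (a, b), m1, m2 >= 0.
        for (p1, q1), (p2, q2) in combinations(rates, 2):
            det = p1 * q2 - p2 * q1
            if abs(det) < 1e-12:
                continue  # the two peers have the same output mix
            m1 = (a * q2 - b * p2) / det
            m2 = (b * p1 - a * q1) / det
            if m1 >= 0 and m2 >= 0:
                best = min(best, m1 + m2)
        scores.append(best)
    return scores


# Hypothetical data: spending per pupil (thousands of dollars) and
# (reading, mathematics) pass rates for schools A through E.
inputs = [5.0, 5.0, 5.0, 5.0, 10.0]
outputs = [(60.0, 30.0), (30.0, 60.0), (50.0, 50.0), (40.0, 40.0), (60.0, 30.0)]
scores = dea_scores(inputs, outputs)
```

In this invented sample, schools A and B reach the frontier by specializing in one subject, school C reaches it through balanced performance, and schools D and E fall inside it.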

Education Production Functions (EPFs) 

Education production functions link outputs and in- 
puts in a relationship that can be written to include a 
specific term to capture the persistent efficiency of a 
school’s production process over a period of years. Given 
panel data on a set of schools for a number of years, 
efficiency is directly measured and not inferred from 
the error term, as in the APM. Instead, a “fixed effects 
model” can be estimated using Ordinary Least Squares 
(OLS) regression in which the “school effect” captures
the impact of the time-invariant char- 
acteristics of the school on the output 
measure, conditional on the observed 
differences in school inputs and stu- 
dent characteristics. The output may 
be specified as the change in test scores 
across grades between years, the change 
within the same grade between years, 
or the level of the test score with a 
lagged-year test score included as a 
right-hand-side “input.” The model 
may also include a grade effect to cap- 
ture grade-specific phenomena as well 
as nonlinear relationships. This for- 
mulation is further developed in 
Schwartz and Zabel (2003). 
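
A minimal sketch of the fixed effects idea follows. It is an illustration under simplifying assumptions and not the model estimated in Schwartz and Zabel (2003): there is a single right-hand-side input (the lagged test score), the panel values are invented, and the function name is hypothetical. The slope is estimated from deviations around each school's own means, which gives the same slope as OLS with a dummy variable for every school.

```python
from collections import defaultdict


def school_fixed_effects(panel):
    """Estimate a one-regressor fixed effects model by the within transformation.

    panel holds (school, lagged_score, score) tuples, one per school-year.
    Returns the common slope and a dict of estimated school effects.
    """
    by_school = defaultdict(list)
    for school, lagged, score in panel:
        by_school[school].append((lagged, score))

    # School-specific means of the regressor and of the outcome.
    means = {
        school: (
            sum(x for x, _ in obs) / len(obs),
            sum(y for _, y in obs) / len(obs),
        )
        for school, obs in by_school.items()
    }

    # Slope from deviations around each school's own means.
    num = sum((x - means[s][0]) * (y - means[s][1]) for s, x, y in panel)
    den = sum((x - means[s][0]) ** 2 for s, x, y in panel)
    beta = num / den

    # The school effect is the part of the school's mean outcome that is
    # not explained by its mean input.
    effects = {s: my - beta * mx for s, (mx, my) in means.items()}
    return beta, effects


# Invented three-year panel for three schools.
panel = [
    ("A", -0.1, 0.15), ("A", 0.0, 0.20), ("A", 0.3, 0.35),
    ("B", 0.2, 0.00), ("B", 0.4, 0.10), ("B", 0.1, -0.05),
    ("C", -0.3, -0.15), ("C", -0.2, -0.10), ("C", 0.0, 0.00),
]
beta, effects = school_fixed_effects(panel)
ranking = sorted(effects, key=effects.get, reverse=True)
```

Here `ranking` orders the schools from the largest to the smallest estimated school effect.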

As with the APM, EPFs permit only one output mea- 
sure in each equation. Multiple grades or subject areas 
could, however, be combined in a variety of ways, either 
as a single composite output measure or by combining 
school fixed effects from multiple equations. In the EPF, 
the schools with the largest estimated school effects would 
be considered the most efficient or “best” schools. Un- 
like an APM, the fixed effects specification allows the 
analyst to disentangle the effect of unchanging school 
characteristics from random error. However, the school 
fixed effect may still largely be a “black box,” capturing 
all the unmeasured school characteristics affecting per- 
formance but offering little guidance as to what those 
characteristics might be. Another alternative is to 
“purge” the estimated fixed effects of time-invariant 






³ Extensions of the model can also account for allocative efficiency, but the basic model does not.

characteristics, such as location, by running a second
regression. This is explored in Schwartz and Zabel (2003). 

Cost Functions 

School cost functions are the analogs of production func- 
tions, and rest upon the same underlying theoretical 
foundation. While a production function captures the 
relationship between inputs and outputs directly, the 
corresponding cost function captures the minimum cost 
of producing a given level of output, conditional on the 
prices of inputs. In principle, the dependent variable in 
a cost function is the cost incurred in producing educa- 
tion, and the independent variables are outputs, input 
prices, and other cost factors. In practice, expenditures 
are used as the dependent variable in 
cost functions and, as described be- 
low, data on input prices are limited. 

Cost functions offer several concep- 
tual advantages over EPFs for mea- 
suring school efficiency. First, while 
input quantities may not be exog- 
enous to schools, input prices are 
more likely to be exogenous at the 
school level. This is important to
interpreting the results: OLS regressions
are correctly specified and have
the usual interpretation only if the 
independent variables are exogenous. 

Otherwise, coefficients can be biased, 
confounding interpretation. Second, like DEA, cost 
functions can include multiple outputs simultaneously 
since they are independent variables in the model. To 
the extent that input prices are influenced by school 
activities, though, treating them as exogenous variables 
may not be appropriate.⁴ From a practical standpoint,
a more difficult problem is that good data on input 
prices may be difficult to obtain, particularly at the 
school level. 
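
One way to write down such a cost function for panel data is shown below. The log-linear form and all of the notation are assumptions made for illustration. The source describes the variables but does not give an equation.

```latex
\ln E_{it} = \alpha_i
  + \sum_{k} \beta_k \, y_{kit}
  + \sum_{m} \gamma_m \ln w_{mit}
  + \sum_{j} \delta_j \, z_{jit}
  + \varepsilon_{it}
```

Here E_{it} is per-pupil expenditure of school i in year t, the y_{kit} are output measures such as test results, the w_{mit} are input prices such as teacher salaries, the z_{jit} are other cost factors such as student characteristics, and \alpha_i is a school effect. Under this sign convention a school with a lower \alpha_i spends less than otherwise similar schools to produce the same outputs.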



Empirical Analyses and Comparisons 
of Results 

In this section we assess the reliability and consistency
of school performance measures across quantitative
techniques.⁵ As described above, we constructed
data sets for use in each analysis with an eye toward 
consistency across analyses. While it was not possible 
to construct identical data sets because of differing 
data needs, each analysis uses largely the same schools 
and same variables as inputs and outputs. We begin 
with a discussion of issues confronted in assembling 
the data required for estimating efficiency using the 
different methods. 

Data Challenges 

Each of the four methods requires 
somewhat different data — in variables 
and in the amount of data (number of 
years) — and imposes somewhat differ- 
ent limitations on the use of data, 
implying that slightly different 
samples will be required. To begin, the 
methods differ in the treatment of 
missing data. A basic requirement of 
all of these methods is that, while some 
are able to accommodate missing val- 
ues in independent variables, in all 
cases observations with missing data 
for the dependent variable must be 
omitted. Thus, samples may differ 
because of differences in the dependent variable used. 
APM- or EPF-based measures require complete data on 
the test score specified as the dependent variable; cost- 
function-based measures require complete data on costs. 
Rather than restrict our analyses only to those schools 
for which a full set of data was available with no missing 
values, which may be unrepresentative of the whole, we 
allowed for slightly different samples.⁶






⁴ If, for example, low-performing schools are authorized to pay teachers higher salaries, then the salaries are not appropriately viewed as
exogenous, complicating the estimation of cost functions.

⁵ Papers by Stiefel et al. (2003), Schwartz and Zabel (2003), Rubenstein (2003), and Schwartz, Stiefel, and Bel Hadj Amor (2003) discuss
specific issues raised by the analyses using APMs, EPFs, DEA, and cost functions, respectively, and some of these results are also
summarized in the conclusions to this paper.

⁶ For APMs and production and cost functions, missing values in independent variables can be dealt with by interpolation or, as in the
studies underlying this paper, by “re-coding” the variables: a new variable is used in the regressions which equals the original variable, if
it is not missing, or zero if it is missing. In addition, a dummy variable coded one if the value is reported and zero otherwise is included
in the models. The coefficient on this variable, if significant, indicates that the value of the dependent variable varies systematically
between the group of schools that have the data and the group of schools that do not.

In this light, APMs and production functions on the
one hand complement cost functions on the other; 
since test scores are independent variables in a cost 
function, schools with missing data on one or more 
test scores remain in the sample and efficiency mea- 
sures can be computed for them. Likewise, it is pos- 
sible to construct efficiency measures for schools that 
do not have expenditure data by using APMs or pro- 
duction functions. 
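
The “re-coding” of missing independent variables described in the footnote to this section can be sketched as follows. The function name and the use of `None` to mark a missing value are assumptions made for the illustration.

```python
def recode_missing(values):
    """Return the re-coded variable and its reporting dummy.

    The re-coded variable equals the original value where it is reported
    and zero where it is missing. The dummy is one where the value is
    reported and zero otherwise. Both enter the regression together.
    """
    recoded = [0.0 if v is None else float(v) for v in values]
    reported = [0 if v is None else 1 for v in values]
    return recoded, reported


# Hypothetical input: the second school did not report the variable.
recoded, reported = recode_missing([12.5, None, 0.0, 7.0])
```

The dummy is what separates a true zero (third school) from a missing value (second school), since both are zero in the re-coded variable.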

Dealing with missing data in DEA is more problematic.
Observations with missing data cannot be included
in DEA, so schools that have missing data
on any of the variables are dropped from the estimation.
This again raises an issue of internal validity. In
addition, DEA cannot accommodate 
variables with zero or negative values. 

Instead, variables may be recoded: a 
zero is replaced with a very small 
number (to be defined on a case-by- 
case basis). Similarly, the DEA pro- 
cedure assumes that increases in 
inputs lead to increases in outputs. 

Thus, if an input is negatively corre- 
lated with an output, it must be re- 
specified to have a positive 
correlation. For example, rather than 
using the percentage of students who 
are receiving free lunch as a measure 
of poverty, the percentage of students 
who are not receiving free lunch is 
created and included. The latter implies making as- 
sumptions as to the relationship between the input 
and the output. While it is generally accepted that 
the relationship between free lunch eligibility and 
school performance is negative, other relationships, say 
between school performance and the performance of 
females, are less obvious and must be determined em- 
pirically. In practice, assumptions are made based on 
theory and correlations among the relevant variables. 
Care must be taken when interpreting the results, how- 
ever, because a variable can be coded differently in dif- 
ferent samples or when using different techniques (such 
that the percentage of students who are female may be 
used in one sample and the percentage of students 
who are male used in another). 
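
The two re-coding steps for DEA inputs can be sketched as below. This is an illustration only: the function name, the default replacement value of 0.01, and the use of the sample covariance to decide the direction are assumptions. As the text notes, the replacement value and the direction of the relationship are set case by case in practice.

```python
def prepare_dea_input(values, output, epsilon=0.01):
    """Re-specify a percentage input so that it can enter a DEA model.

    If the input moves against the output in the sample, it is replaced
    by its complement (100 minus the value). Any zero is then replaced
    by a small positive number.
    """
    n = len(values)
    mean_v = sum(values) / n
    mean_o = sum(output) / n
    covariance = sum((v - mean_v) * (o - mean_o) for v, o in zip(values, output)) / n
    if covariance < 0:
        values = [100.0 - v for v in values]
    return [v if v > 0 else epsilon for v in values]


# Hypothetical data: percent free lunch eligible and percent passing.
free_lunch = [90.0, 50.0, 100.0, 20.0]
pass_rate = [30.0, 55.0, 20.0, 80.0]
not_free_lunch = prepare_dea_input(free_lunch, pass_rate)
```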

As noted earlier, the methods differ in the number of 
years of data required for estimating efficiency. APMs 



and DEA are cross-sectional methods, which implies
that only one year of data is necessary or must be 
chosen for estimation. While the most current year 
of data may be used, in an effort to better reflect 
current conditions, the choice may be based upon 
the availability of data, if, for example, some vari- 
ables are available on an irregular basis. If the goal is 
to compare across methods and/or locations, the most 
current year of data for all methods and/or places 
may be appropriate. Efficiency measures can be esti- 
mated using these methods for several years for com- 
parison across years. 

While production and cost functions may be estimated 
with a single year of data, estimating efficiency mea- 
sures using school fixed effects, as de- 
scribed above, requires a panel data 
set. The implication is important: data 
must be available for a school for at 
least two years (and more specifically, 
they must have data on the depen- 
dent variable for at least two years), 
so that new schools must be excluded. 
In truth, this may be consistent with 
policy objectives, for example, to give 
new schools one or two “experimen- 
tal” years before they are held ac- 
countable for student performance. 
But it may also blunt the incentives 
for schools to be efficient if the schools 
are not eligible for rewards or sanc- 
tions. More generally, these methods may be some- 
what less useful for jurisdictions in which there are a 
considerable number of school reorganizations, open- 
ings, closings, etc., and analysts must make hard deci- 
sions about when a school is “new” and when a school 
is more appropriately treated as persisting, even if it is 
somewhat changed. 

Notice, also, that the methods differ in the variables 
necessary for estimation — and, while some of these 
variables are relatively common, others are quite scarce. 
While the minimum data requirements to use the 
APM method are relatively easy to meet, the data re- 
quired to estimate cost functions are rarely available. 
Even where school-level data on expenditures are avail- 
able (and ignoring the potential distinction between 
cost and expenditure data), data on input prices are 
scarce. Data on teacher salaries, or salary schedules, are



crucial for estimating cost functions, yet these are frequently
unavailable or, as in our New York City
sample, salaries may not vary across the sample of
schools, making it impossible to include them in a
regression equation.⁷ One alternative, perhaps not
fully satisfying, is to include teacher characteristics as 
proxies as we do in our New York City analyses be- 
low. In this case, however, the resulting efficiency 
measures are closer to “adjusted cost measures” — bear- 
ing the same relationship to efficiency measures based 
on cost functions as APMs bear to efficiency measures 
based on EPFs. In what may be viewed as a “best case 
scenario,” salary data may be available, as in our Ohio 
sample, but since teacher contracts are typically ne- 
gotiated at the district level rather than at the school 
level, salaries are unlikely to vary across schools within 
districts. Thus, estimating a true cost function re- 
quires, at the very least, data that span a significant 
number of school districts. 

Finally, estimating efficiency measures using any of 
these techniques requires defining a sample of schools 
over which variables and the estimated parameters are 
likely to be consistent. Is it plausible that the “tech- 
nology” of producing education is the same in elemen- 
tary schools and high schools? If not, then it is 
inappropriate to estimate the efficiency of these schools 
as a group — the coefficients of the education produc- 
tion function would be mis-estimated, as would the 
parameters of the cost functions. While it is clearly 
difficult to justify comparing, say, elementary and high 



schools, more subtle choices must be made. As an ex- 
ample, is it appropriate to compare the efficiency of 
all the schools that serve a sixth grade — using a single 
production function, APM equation, or cost func- 
tion — even though some are elementary schools serv- 
ing kindergarten and other early childhood grades, 
while others are middle schools serving higher grades? 
Should we only compare schools having the same grade 
spans? Doing so would further restrict sample size and 
make it difficult to form samples of sufficient size. Ide- 
ally, a compromise can be found in which all schools 
in the sample are similar along a number of lines and 
the sample size is large enough to obtain reliable re- 
sults — with other differences controlled through the 
regressions. 

Empirical Comparisons 

Table 2 displays the Pearson correlation coefficients across 
the four methods for the Ohio data. Note that two of 
the methods (APMs and EPFs) have separate reading
and mathematics results because, unlike DEA and cost 
functions, they use only one output measure at a time. 
The raw correlations among methods, even as different 
as DEA and APMs or EPFs, are often above 0.5. In fact, 
only the cost functions exhibit correlations so low with 
any other method as to be indistinguishable from zero. 
Raw measures for the same grade and same test over 
years, or different tests in the same year, are often corre- 
lated above 0.90. Still, these correlations in table 2, 
across different methods, are quite high. 
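
A correlation table of this kind can be produced from per-school efficiency scores as sketched below. The scores are invented and the dictionary keys are only labels. The sketch assumes every method has a score for the same schools in the same order.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(scores):
    """Pairwise correlations between the score lists of every method."""
    names = list(scores)
    return {a: {b: pearson(scores[a], scores[b]) for b in names} for a in names}


# Invented efficiency scores for the same five schools under three methods.
scores = {
    "EPF (reading)": [0.21, -0.05, 0.10, -0.30, 0.04],
    "APM (reading)": [0.15, 0.02, 0.12, -0.25, -0.04],
    "DEA": [1.00, 0.82, 0.95, 0.70, 0.88],
}
matrix = correlation_matrix(scores)
```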



⁷ This is a minimum requirement. Ideally, data on other input prices should be included, but those are not commonly reported.



Table 2. Ohio Pearson correlation coefficients across four quantitative techniques: 1995-99

                    EPF            EPF            DEA            APM            APM            Cost function
                    (reading)      (mathematics)                 (reading)      (mathematics)
EPF (reading)       1.000          0.573***       0.509***       0.628***       0.385***       -0.045
EPF (mathematics)   0.573***       1.000          0.505***       0.381***       0.679***       -0.026
DEA                 0.509***       0.505***       1.000          0.639***       0.639***       -0.051
APM (reading)       0.628***       0.381***       0.639***       1.000          0.498***       -0.039
APM (mathematics)   0.385***       0.679***       0.639***       0.498***       1.000          0.006
Cost function       -0.045         -0.026         -0.051         -0.039         0.006          1.000

*** Indicates significance at the 1 percent level.

NOTE: Data for DEA and APMs are from 1997-98; EPFs and cost functions use a balanced panel, 1995-96 through 1998-99.

SOURCE: Authors' calculations based upon data provided by the New York City Board of Education, the Ohio Department of
Education, and Ohio Education Association.

The results for New York City, displayed in table 3,
are somewhat more variable than those for Ohio. While
some correlations are higher than those in Ohio (between
the reading and mathematics EPFs or between
the reading and mathematics APMs), many are lower. Once again, the
cost function efficiency measures show correlations at or near
zero with the other measures.

These correlations may raise more questions than they 
answer. For example, do correlations in these ranges 
mean that if a jurisdiction tried to categorize schools 
into several groups — one indicating successful schools, 
another indicating failing schools, and a third in the 
middle — the use of alternative methods would lead to 
large differences in the schools in the successful and fail- 
ing groups? As another example, what are the charac- 
teristics of the schools that shift groups across methods? 
And finally, what are the key differences among these 
methods that should lead a jurisdiction to choose one 
or the other depending on its objectives? Tables 4 and 5 
summarize the results from an array of cross-tabulations 
based on school rankings from selected methods.⁸ The
schools are divided into the top group (schools whose 
efficiency is above the 90th percentile), the bottom group 
(schools whose efficiency is below the 10th percentile) 
and the middle group (schools whose efficiency ranges 
from the 10th to the 90th percentiles). In each cell, a 
series of numbers indicates the percentage of the schools 
that move, or do not move, from one of the percentile 



⁸ A complete set of cross-tabulations is available from the authors.



groups to another when the method listed in the column,
rather than the method listed in the row, is used.

More specifically, D2 (“down two”) is the percentage 
of schools that move from the top to the bottom rank- 
ing group. The top left cell in table 4, for example, 
indicates that only 0.1 percent of the schools that are 
at the top based on the reading EPF are at the bottom 
based on the mathematics EPF. The D group (“down”)
combines two groups of schools: (1) the schools that 
move from the top to the middle group and (2) the 
schools that move from the middle to the bottom 
group. The same cell indicates that 11.4 percent of the 
schools move from top to middle or middle to bottom 
when switching from the reading EPF to the mathematics
EPF. C (“constant”) is the percentage of schools
that are ranked in the same percentile group according 
to both methods, and is, therefore, the combination of 
three groups: (1) the schools that are in the top accord- 
ing to both methods, (2) the schools that are in the 
middle according to both methods, and (3) the schools 
that are at the bottom according to both methods. The 
top left cell in table 4 indicates that the vast majority of 
schools (76.8 percent) are ranked in the same percen- 
tile group by the reading and mathematics EPFs. The 
U group (“up”) combines two groups of schools: (1) 
the schools that move from bottom to middle and (2) 
the schools that move from middle to top. As shown in 
the top left cell of table 4, 11.7 percent of the schools 



Table 3. New York City Pearson correlation coefficients across four quantitative techniques: 1995-2001

                    EPF            EPF            DEA            APM            APM            Cost function
                    (reading)      (mathematics)                 (reading)      (mathematics)
EPF (reading)       1.000          0.888***       0.168***       0.374***       0.310***       0.037
EPF (mathematics)   0.888***       1.000          0.157***       0.331***       0.456***       0.086**
DEA                 0.168***       0.157***       1.000          0.073*         0.087**        -0.061
APM (reading)       0.374***       0.331***       0.073*         1.000          0.585***       -0.035
APM (mathematics)   0.310***       0.456***       0.087**        0.585***       1.000          0.062
Cost function       0.037          0.086**        -0.061         -0.035         0.062          1.000

* Indicates significance at the 10 percent level.

** Indicates significance at the 5 percent level.

*** Indicates significance at the 1 percent level.

NOTE: Data for DEA and APMs are from 1999-2000; EPFs and cost functions use a balanced panel, 1995-96 through 2000-01.

SOURCE: Authors' calculations based upon data provided by the New York City Board of Education, the Ohio Department of
Education, and Ohio Education Association.

Table 4. Comparison of percentile rankings by quantitative technique, Ohio schools: 1995-99

               EPF            DEA            APM            APM            Cost function
               (mathematics)                 (reading)      (mathematics)
EPF            D2 = 0.1       D2 = 0.0       D2 = 0.0       D2 = 0.1       D2 = 0.8
(reading)      D = 11.4       D = 13.4       D = 11.9       D = 14.4       D = 16.0
               C = 76.8       C = 69.6       C = 76.3       C = 70.9       C = 66.6
               U = 11.7       U = 16.3       U = 11.7       U = 14.4       U = 15.7
               U2 = 0.0       U2 = 0.8       U2 = 0.1       U2 = 0.1       U2 = 0.9

EPF            †              D2 = 0.0       D2 = 0.1       D2 = 0.0       D2 = 0.8
(mathematics)                 D = 12.5       D = 14.8       D = 10.8       D = 14.6
                              C = 71.7       C = 70.4       C = 78.6       C = 69.1
                              U = 14.8       U = 14.3       U = 10.5       U = 15.1
                              U2 = 1.1       U2 = 0.4       U2 = 0.1       U2 = 0.5

DEA            †              †              D2 = 0.0       D2 = 0.4       D2 = 0.7
                                             D = 16.0       D = 14.4       D = 15.6
                                             C = 72.5       C = 74.4       C = 64.1
                                             U = 11.5       U = 10.8       U = 15.1
                                             U2 = 0.0       U2 = 0.0       U2 = 0.9

APM            †              †              †              D2 = 0.3       D2 = 1.1
(reading)                                                   D = 12.7       D = 15.1
                                                            C = 74.1       C = 68.0
                                                            U = 12.7       U = 14.6
                                                            U2 = 0.3       U2 = 1.3

APM            †              †              †              †              D2 = 0.7
(mathematics)                                                              D = 15.6
                                                                           C = 67.8
                                                                           U = 15.1
                                                                           U2 = 0.9

† Not applicable.

NOTE: The schools are divided into the top group (schools whose efficiency is above the 90th percentile), the bottom group
(schools whose efficiency is below the 10th percentile) and the middle group (schools whose efficiency ranges from the 10th to
the 90th percentiles). In each cell, a series of numbers indicates the percentage of the schools that move, or do not move, from
one of the percentile groups to another when the method listed in the column is used rather than the method listed in the row.
D2 is the percentage of schools that move from top to bottom. D combines the schools that move from top to middle and the
schools that move from middle to bottom. C is the percentage of schools that are ranked in the same percentile group according
to both methods. U combines the schools that move from bottom to middle and the schools that move from middle to top. U2
designates the percentage of schools that move from bottom to top. Data for DEA and APMs are from 1997-98; EPFs and cost
functions use a balanced panel, 1995-96 through 1998-99.

SOURCE: Authors' calculations based upon data provided by the New York City Board of Education, the Ohio Department of
Education, and Ohio Education Association.



are in these two categories. None of the schools in this 
cell are in the U2 group (“up two”), that is, none of the 
schools move from the bottom to the top. 
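
The D2, D, C, U, and U2 shares defined above can be computed from two sets of efficiency scores as in the following sketch. The scores are invented and the function names are hypothetical. The grouping rule is also an assumption: a school is placed in the top group when fewer than 10 percent of schools score strictly higher, and in the bottom group when fewer than 10 percent score strictly lower, so that tied schools always share a group.

```python
def percentile_groups(scores, share=0.10):
    """Assign each school to the bottom (0), middle (1), or top (2) group."""
    n = len(scores)
    groups = []
    for s in scores:
        higher = sum(1 for t in scores if t > s)
        lower = sum(1 for t in scores if t < s)
        if higher < share * n:
            groups.append(2)
        elif lower < share * n:
            groups.append(0)
        else:
            groups.append(1)
    return groups


def transitions(row_scores, col_scores):
    """Percentage of schools in each movement category, row method to column method."""
    row_groups = percentile_groups(row_scores)
    col_groups = percentile_groups(col_scores)
    moves = [c - r for r, c in zip(row_groups, col_groups)]
    labels = {-2: "D2", -1: "D", 0: "C", 1: "U", 2: "U2"}
    n = len(moves)
    return {name: 100.0 * sum(1 for m in moves if m == step) / n
            for step, name in labels.items()}


# Invented scores for ten schools under two methods.
method_in_row = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
method_in_column = [5, 1, 3, 4, 2, 6, 7, 8, 10, 9]
result = transitions(method_in_row, method_in_column)
```

With this rule, a method that gives many schools the same maximum score places all of them in the top group, so that group can hold more than 10 percent of schools.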

Table 4 presents results for Ohio and table 5 presents 
results for New York City. The least consistent ranking 
comparisons are those comparing the New York City 
DEA results to the other methods. As shown in table 5, 
a high proportion of schools are awarded a higher rank 
based upon the DEA measures than they are awarded 



based upon other measures. For example, the second cell 
in row 1 of table 5 indicates that 48.8 percent of schools 
are ranked one group higher using DEA as compared to 
the reading EPF and 5.3 percent of schools are ranked 
two groups higher. This pattern is the result of a large 
proportion of New York City schools being rated as fully 
efficient using DEA. Thus, since many schools earn 100
percent efficiency scores (tying for first place), the 
highest percentile grouping in the New York City DEA 
models actually includes more than 10 percent of schools.⁹



⁹ Other specifications of the DEA model produce lower proportions of schools rated efficient (see Rubenstein 2003). For comparative
purposes, the DEA specification presented in this paper uses the same combination of inputs and outputs as the other techniques.
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Table 5. Comparison of percentile rankings by quantitative technique, New York City schools: 
1995-2001 





Method in row      Measure  EPF (mathematics)  DEA    APM (reading)  APM (mathematics)  Cost function

EPF (reading)      D2        0.0                0.2    0.0            0.0                1.0
                   D         6.1               10.8   14.5           14.8               15.5
                   C        87.7               34.9   71.4           70.4               66.6
                   U         6.1               48.8   13.8           14.8               16.4
                   U2        0.0                5.3    0.3            0.0                0.5

EPF (mathematics)  D2          †                0.5    0.2            0.0                1.2
                   D           †                9.1   14.6           13.5               14.6
                   C           †               37.0   70.6           73.1               67.8
                   U           †               48.2   14.3           13.5               16.0
                   U2          †                5.2    0.3            0.0                0.5

DEA                D2          †                  †    4.8            5.8                6.8
                   D           †                  †   50.5           48.8               46.9
                   C           †                  †   33.7           33.7               34.9
                   U           †                  †   10.1           11.1               10.8
                   U2          †                  †    0.8            0.5                0.7

APM (reading)      D2          †                  †      †            0.0                1.5
                   D           †                  †      †           11.6               14.8
                   C           †                  †      †           76.9               66.9
                   U           †                  †      †           11.3               15.8
                   U2          †                  †      †            0.2                1.0

APM (mathematics)  D2          †                  †      †              †                0.5
                   D           †                  †      †              †               16.5
                   C           †                  †      †              †               66.8
                   U           †                  †      †              †               15.1
                   U2          †                  †      †              †                1.2

† Not applicable.

NOTE: The schools are divided into the top group (schools whose efficiency is above the 90th percentile), the bottom group 
(schools whose efficiency is below the 10th percentile) and the middle group (schools whose efficiency ranges from the 10th to 
the 90th percentiles). In each cell, a series of numbers indicates the percentage of the schools that move, or do not move, from 
one of the percentile groups to another when the method listed in the column is used rather than the method listed in the row. 
D2 is the percentage of schools that move from top to bottom. D combines the schools that move from top to middle and the 
schools that move from middle to bottom. C is the percentage of schools that are ranked in the same percentile group according 
to both methods. U combines the schools that move from bottom to middle and the schools that move from middle to top. U2 
designates the percentage of schools that move from bottom to top. Data for DEA and APMs are from 1999-2000; EPFs and cost
functions use a balanced panel, 1995-96 through 2000-01. 

SOURCE: Authors' calculations based upon data provided by the New York City Board of Education, the Ohio Department of 
Education, and Ohio Education Association. 
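
The movement categories defined in the note to table 5 can be sketched in code. This is a minimal illustration rather than the authors' procedure: it assumes each school has already been assigned to the bottom, middle, or top percentile group under both methods. The function name `movement_shares` and the 0/1/2 group coding are hypothetical.

```python
from collections import Counter

# Hypothetical group coding: 0 = bottom group, 1 = middle group, 2 = top group.
LABELS = {-2: "D2", -1: "D", 0: "C", 1: "U", 2: "U2"}


def movement_shares(row_groups, col_groups):
    """Percentage of schools in each movement category.

    row_groups: group of each school under the method listed in the row.
    col_groups: group of the same schools under the method listed in the column.
    """
    if len(row_groups) != len(col_groups) or not row_groups:
        raise ValueError("both methods must rank the same non-empty set of schools")
    # A negative difference means the school ranks lower under the column method.
    counts = Counter(LABELS[c - r] for r, c in zip(row_groups, col_groups))
    n = len(row_groups)
    return {key: 100.0 * counts.get(key, 0) / n for key in ("D2", "D", "C", "U", "U2")}


# Ten made-up schools, for illustration only.
row = [2, 2, 1, 1, 0, 0, 1, 1, 1, 1]
col = [0, 1, 1, 2, 1, 2, 1, 1, 1, 0]
print(movement_shares(row, col))
```

The five shares in a cell should sum to 100 percent, apart from rounding, which is also a quick consistency check on the published cells.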



If these methods are used to distribute rewards and 
sanctions to schools, it may be particularly distressing 
to find that schools are labeled as among the highest 
performers using one method or output measure and 
among the lowest performers with another. Overall, as 
a broad sweep, the results in tables 4 and 5 are some- 
what surprising as they show relatively little move- 
ment between the top and bottom groups across 
methods, even ones as uncorrelated as cost functions 
and EPFs or ones as unrelated conceptually and empirically
as EPFs and DEA. In general, the DEA measures
show the most movement, again an interesting
result because, while they are not empirically the least 
correlated with other methods, they are, arguably, the 
least related conceptually. That is, an EPF and a cost 
function are theoretically the inverse of one another, 
and an APM is an “atheoretical” EPF. So the EPFs, 
cost measures, and APMs are highly related in theory. 
But DEA, while an “input-output” type model, dif- 
fers in its ability to choose frontier schools that excel 
in only one output. 
As noted above, these comparisons raise the question,
what are the characteristics of the schools that shift 
groups across methods? While providing a satisfying 
answer to this question is outside the scope of this pa- 
per, we compare the characteristics of the set of schools 
remaining in the same ranking category across two 
methods with those that changed categories. The re- 
sults were intriguing, if only suggestive of questions for 
further investigation. As an example, in the New York 
City analyses, a series of pair-wise comparisons between 
the rankings based upon the different methods found 
that in 13 out of 15 of these comparisons, the schools 
with constant rankings were larger than the schools that 
had shifted categories, either up or down, and exhib- 
ited lower expenditures per pupil; in 14 out of 15 com- 
parisons, the consistently ranked 
schools had a higher share of licensed 
teachers, experienced teachers, and 
teachers with master’s degrees; and in 
all cases, the consistent schools had 
more teachers with at least 2 years in 
the school than did the schools that 
changed categories. Similarly, in Ohio, 
we found that in 13 out of 15 pair- 
wise comparisons, the consistently 
ranked schools were larger. Other dif- 
ferences were less dramatic, though. 

These simple comparisons appear to 
support findings from other work (for 
example, Kane and Staiger 2002) in- 
dicating that performance measures for 
small schools may be particularly susceptible to mea- 
surement error and random events. The other com- 
parisons also suggest the need for further work to 
investigate the circumstances under which schools are 
persistently rated as high or low performing. 

Conclusions 

The four methods of school efficiency measurement 
we examine use different methodological approaches, 
but all are related conceptually by their connection to 
economic output/input theory. That is, each method 
implicitly treats schools as “firms” that convert a vari- 
ety of inputs (resources, employees, students, etc.) into 
an array of outputs (typically some measure of stu- 
dent performance on tests, though measures such as 
graduation rates, attendance, and social outcomes could 
be added or substituted). The characterization of 
schools as “firms” does not imply that schools are, or 



should be, factory-like organizations or profit-making 
entities. It does, however, imply that we must try to 
identify the most effective strategies for accomplish- 
ing the most we can with increasingly scarce resources. 

Using similar school-level data sets for two jurisdic- 
tions provides a unique opportunity to examine the 
characteristics of several different methods of school 
performance measurement, and to compare the sta- 
bility of results using these multiple methods. The 
comparisons of efficiency rankings from these tech- 
niques indicate that the efficiency scores and effi- 
ciency rankings are moderately consistent, generally 
producing midrange Pearson and Spearman rank correlation
coefficients. This result suggests, however, that caution
may be warranted in using these methods to distinguish
subtle differences in school performance. While others have described
large inconsistencies in school 
rankings across grades and subject 
matter exams (Kane and Staiger 
2002) and across specifications 
(Clotfelter and Ladd 1996), our re- 
sults show that different analytic 
methods may also produce different 
results, even when using largely the 
same data and specifications. While it 
may not be altogether surprising 
that methods using panel data pro- 
duce different results from the 
methods using cross-sectional data (due to the dif- 
ferent data used), our results indicate that the two 
methods using panel data (EPFs and cost functions) 
tend to produce very different results from each other 
in both samples. In the New York City sample, the 
methods using cross-sectional data also produced low 
correlations, while in the Ohio data they were rela- 
tively high (over 0.60). 

The results also suggest that the various methods are 
unlikely to produce vastly different lists of the highest 
and lowest performing schools. If the purpose of the 
analysis is to identify consistently high-performing and 
low-performing schools, perhaps to study best prac- 
tices or choose candidates for intervention, then the 
use of these multiple methods may provide a more 
reliable approach than the use of a single method. Our 
analyses suggest that these outlier listings will not 
change substantially across techniques. If the purpose 



of the analysis is to provide detailed relative rankings
of schools, however, our analyses suggest that the re- 
sults may be too sensitive to the quantitative techniques 
employed to produce reliable rankings throughout the 
distribution of schools. 

Our analyses also highlight some of the potential ben- 
efits and drawbacks of each of the four methods: 

■ Adjusted performance measures (APMs). APMs are 
the most tractable of the four methods, impos- 
ing the least onerous data and analysis require- 
ments. Their reliance on single output measures, 
though, requires consensus on the most appro- 
priate measure — or composite of measures — to 
use for the analysis. Because of 
their relative ease of estimation, 
they may be the best suited of 
the four methods for construc- 
tion of annual performance 
measures or reports. 

■ Education production functions 
(EPFs). While EPFs have con- 
siderably larger data require- 
ments than APMs, they may be 
more effective for identifying 
persistent, rather than random 
or transitory, differences across 
schools. In school systems with 
relatively stable groups of 
schools, the EPF procedure may 
be a feasible approach for identifying consistent 
performance differences. At the same time, they 
may be of limited use in dynamic, rapidly chang- 
ing school systems. Like APMs, they raise issues 
regarding the appropriate selection of output 
measures and grade levels. 

■ Data envelopment analysis (DEA). DEA has the 
distinct advantage of allowing multiple outputs 
and inputs simultaneously, permitting schools to 
focus on particular strengths. This type of special- 
ization may not, however, be generally accepted 
by educators or families if, for example, schools 
achieve high test scores through high dropout 
rates. While the measures can be constructed with 
a single year of data, they require extensive data 
management and specialized software. 



■ Cost functions. Cost functions may be particu- 
larly useful for evaluating performance and effi- 
ciency in school systems facing severe financial 
constraints. Like EPF measures, they are likely 
to be relatively stable and effective for identify- 
ing persistent differences across schools. Like 
DEA, they include multiple outputs simulta- 
neously. Despite these benefits, though, they are 
likely to impose the most prodigious data require- 
ments and may be the least intuitive of the four 
methods. 

The wide variation in the quality and quantity of in- 
puts that each school faces, along with the variety of 
choices regarding outputs, makes it extremely diffi- 
cult to validly and reliably identify 
schools that are making the most ef- 
fective use of their resources and most 
efficiently achieving their goals. Ulti- 
mately, none of the measures we ex- 
plore in this paper may be well suited 
for drawing the sharp distinctions be- 
tween schools necessary for high- 
stakes accountability systems. 
Unfortunately, though, simplistic 
measures of school performance, 
which do not account for the com- 
plex environment of schooling, risk 
identifying the wrong schools as be- 
ing exemplars of high performance or 
failures in need of interventions (see 
Rubenstein, Schwartz, and Stiefel 
2003). This problem is particularly critical when the 
performance measures are used to distribute rewards 
and sanctions. While the rankings produced with the 
techniques in this paper may be somewhat volatile and 
are often complex, they may produce more valid mea- 
sures of a school’s contribution to student learning than 
do measures that do not attempt to mitigate the ef- 
fects of student socioeconomic status on outcomes. 
Thus, we may face an unavoidable tradeoff between 
simplicity and validity in constructing such measures. 
The efficiency measures examined in this study are 
not simple, but may move us closer to accurately and 
reliably identifying those schools that are making the 
most effective use of their resources to educate their 
students. 



Every January, Education Week releases its annual report,
Quality Counts, which grades the states in several
areas including the equity and adequacy of
resources dedicated to education. The grades are based 
on a series of indicators developed and revised over 
the last few years with the advice of experts in the 
area of school finance. Data, mostly from the Na- 
tional Center for Education Statistics and the U.S. 
Census Bureau, are used to calculate the indices with 
the school district as the unit of analysis. As part of 
the effort to continually ensure the value of the data 
presented in Quality Counts, this paper will take ad- 
vantage of recent court decisions mandating drastic 
changes in state education finance systems to evalu- 
ate the efficacy of the report’s finance adequacy and 
equity measures. 

In the four states studied for this paper — New Hamp- 
shire, New Jersey, Vermont, and Wyoming — state leg- 
islatures have implemented changes in the way the 
state collects and distributes money for education. 
These states were chosen for this analysis because, first, 
the impetus of a court decision in favor of the plaintiff 
forced the legislatures in these states to make sweep- 
ing changes to their education finance systems. Since 



these court-mandated changes tend to be more com- 
prehensive, they should be more easily detected in data 
analyzed from the years before and after reforms were 
implemented. Second, these states were chosen for the 
relatively settled nature of these cases, meaning that a 
court decision and the necessary legislative action have 
taken place — regardless of whether all groups are happy 
with the outcomes. Finally, the timing of the court 
decision in these four states means that the federal data 
used to calculate these measures are available. 

This paper begins by describing the finance measures 
used in Quality Counts, as well as the federal data used 
to calculate these measures. The paper then explains 
the litigation history of the court cases that mandated 
the changes to the education funding systems in each 
of the four states, and when these reforms were imple- 
mented. Next, for each state, based on the court case 
and school finance reform information, outcomes that 
can be expected from the data analysis are listed. Fi- 
nally, this study analyzes the equity and adequacy mea- 
sures calculated from data before and after states 
implemented finance reforms, to see if these indica- 
tors and the data they represent are accurately mea- 
suring school finance changes in the states. 
Data Sources

The data used for the analysis described in this study 
were compiled from a number of sources and then 
merged into a single database to create variables for 
each school district in the nation. The following data 
sources and variables were used: 

■ U.S. Census Bureau, Public Elementary/Second- 
ary Education Finance Data (commonly known as 
the National Center for Education Statistics’ Com- 
mon Core of Data Local Education Agency [School 
District] Finance Survey [F-33] Data), 1994-2000 

- State Identification Number (STATE) 

- School System Name (NAME)

- School Level Code (SCHLEV)

- NCES ID Code (NCESID) 

- Year of Data (YRDAT) 

- Fall Membership, October (V33) 

- Total Revenue From State Sources (TSTREV)¹

- Total Revenue From Local Sources (LOCREV) 

- Total Current Spending for Elementary/Sec- 
ondary Programs (TCURELSC) 

■ National Center for Education Statistics, Com- 
mon Core of Data, Public Elementary/Second- 
ary School District Universe Data, 1994-2000 

- NCES Agency ID (LEAID) 

- State Abbreviation (ST##) 

- Agency Type Code (TYPE##) 

- Total Schools (SCH##) 

- Students with an Individualized Education 
Plan (SPECED##) 

■ Chambers Cost of Education Index (1993-94) 

- NCES Agency ID (NLEA_ID) 

- Cost of Education Index (CEIL93) 

■ U.S. Census Bureau, Small Area Income and Pov- 
erty Estimates, School District Estimates, 1995 
and 1997 

- FIPS State Code (FIPS) 



- CCD District ID (CCDID) 

- District Name (DISTNAME)

- Estimated Total Population (TOTALPOP) 

- Estimated Population of Children 5 to 17 
Years of Age (CHILD) 

- Estimated Number of Poor Children 5 to 17 
Years of Age Who Are Related to the Head of 
the Household (POORCHRN) 

■ School District Data Book, 1990 Census School 
District Special Tabulation, U.S. Summary 

- Table H062 — Aggregate Value of Specified 
Owner-Occupied Housing Units by Mortgage 
Status (WEALTH) 

- Table P202 — Area in Square Kilometers 
(AREA) 

■ National Center for Education Statistics, Com- 
mon Core of Data, Early Estimates of Public El- 
ementary and Secondary School Education Statis- 
tics, school year 1996-97 to school year 2001-02. 

- Table 7: Per pupil expenditure 

These data were compiled for every district in the na- 
tion, and merged into one file using the NCES dis- 
trict code in each data set as the unique identifier. 
Several variables were calculated from these data to 
create the equity and adequacy indicators discussed in 
this study (see the appendix for a detailed description 
of these variables and calculations). 

Equity and Adequacy Indicators 

Each year in its report Quality Counts, Education Week 
uses the most recent data available from the sources listed 
above to grade states on how adequately and equitably 
they fund education. This study uses some of the same 
indicators used in the grading process for Quality Counts. 
For equity, these measures are state equalization effort, 
targeting score, wealth-neutrality score, coefficient of 
variation, and McLoone Index. For adequacy, the indi- 
cators used in this study are adequacy index, education 
spending per student (adjusted for regional cost differ- 
ences), and the percent of students in districts with per 
pupil expenditures at or above the U.S. average. 



¹ This variable represents all state revenue received by each district including revenue from general formula aid, categorical programs, and
all other revenues from the state. 
In the grading process for Quality Counts, state equalization
effort accounts for 50 percent of the equity
grade, wealth-neutrality for 25 percent, and coefficient 
of variation and McLoone Index each count for 12.5 
percent of the grade. For adequacy, education spend- 
ing per student and the adequacy index each count for 
40 percent of the grade.² Following is a description of
each of the measures used in this analysis. 

State equalization effort 

This indicator is based on the concept that states can 
help equalize funding across districts in two ways: by 
providing all or most of the total funding so there are 
no discrepancies across districts, or by targeting more 
revenue to property-poor districts that are not able to 
raise as much revenue locally. Most states use a com- 
bination of these two strategies. The state equalization 
effort indicator measures these two approaches and is 
the state share of total state and local funding adjusted 
by the degree to which these funds are targeted to 
poorer districts. 

The score for state equalization effort depends on both 
a targeting score and the percentage of state funding. 
The targeting score represents the extent to which state 
funds are targeted to property-poor districts. In Qual- 
ity Counts 2002, targeting score values ranged from 
-.53 to zero, where the more negative a value, the more 
state funds are targeted to poorer districts. A targeting 
score of zero means that the state is not targeting funds 
to property-poor districts. 

The targeting score was calculated using multiple re- 
gression. The regression model was designed to con- 
trol for other district characteristics, besides wealth, 
that could influence state aid. The dependent vari- 
able in the model was adjusted state revenue per 
pupil. This variable was adjusted to reflect geographic 
cost differences relative to each state, and was also 
indexed so that the state’s average per pupil figure 
was 1. The independent variables in the model include
adjusted district wealth per pupil,³ percent of
students in poverty, percent of children in special 



education (i.e., those with individualized education 
plans), student enrollment, and land area per pupil 
(all indexed to the state average). The coefficient for 
the first independent variable (the index of adjusted
district wealth per pupil) from the regression serves as
the targeting score. 

State equalization effort is the state’s share of funding
multiplied by one minus this targeting score. For
example, in 2000, state aid in Florida accounted for 
54.7 percent of total (state and local) revenue, which 
was below the national average for that year (57 per- 
cent). Florida, however, had a targeting score of -.473, 
meaning it targeted more funds to property-poor dis- 
tricts. Therefore, its effort to equalize funding was 
higher than what the state share of funding would 
suggest. The calculation for Florida’s state equaliza- 
tion effort for 2000 is as follows: 



State equalization effort

= State share of funding x (1 - targeting score)
= 54.7 percent x (1 - (-.473))
= 80.6 percent



The state equalization effort adjusts the state share of 
funding to reflect the effort the state has made to tar- 
get funds to property-poor districts. If the state’s tar- 
geting score is zero, the state equalization effort will 
be the same as its state share of funding. In Quality 
Counts 2003, state equalization effort values ranged 
from 43 percent to 98 percent. 
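
The Florida calculation above can be reproduced with a short sketch. The function name `state_equalization_effort` is hypothetical; the formula and the input values are the ones given in the text.

```python
def state_equalization_effort(state_share_pct, targeting_score):
    """State share of funding (in percent) scaled by (1 - targeting score).

    A negative targeting score (more aid to property-poor districts) raises
    the effort above the raw state share; a score of zero leaves it unchanged.
    """
    return state_share_pct * (1.0 - targeting_score)


# Florida, 2000: 54.7 percent state share, targeting score of -.473.
florida = state_equalization_effort(54.7, -0.473)
print(round(florida, 1))  # 80.6
```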

Wealth-neutrality score 

Like the targeting score, the wealth-neutrality score 
also shows the degree to which revenue is related to 
the property wealth of districts. However, this indicator
considers both state and local revenue. West
Virginia, for example, had a targeting score of -.039 
in 2000, indicating that the state is targeting aid to 
property-poor districts. When local revenue was also 
considered in the wealth-neutrality score, however,
West Virginia had a score of .084, meaning that higher property
wealth is still linked to more revenue.



² The remaining 20 percent consisted of measures not used for this analysis: taxable resources spent on education (15 percent) and average
annual rate of change in expenditures per pupil (5 percent).

³ The use of wealth versus income in this model was tested with a sensitivity analysis when this indicator was first introduced to Quality
Counts in 2000. In a comparison of R² for each state using either wealth or income in the model, wealth often explained more variance
in state aid than income.






The wealth-neutrality score was also calculated us- 
ing regression. The dependent variable in the model 
was adjusted state and local revenue per weighted 
pupil. The variable was adjusted to reflect geographic 
cost differences relative to each state, and weighted 
by student needs (i.e., poor students = 1.2, and spe- 
cial education students = 2.3). The figure was also 
indexed so that each state’s average per pupil figure 
was 1. The single independent variable in the model 
was adjusted property wealth per weighted pupil, also 
adjusted to reflect cost differences and student needs, 
and indexed to the state average. The coefficient for 
the independent variable (adjusted property wealth 
per weighted pupil) from the regression serves as the 
wealth-neutrality score. 

In Quality Counts 2003, wealth-neutrality scores 
ranged from -.189 to .311. A negative score means 
that, on average, property-poor districts actually have 
more funding per weighted pupil than wealthy dis- 
tricts do, and a positive score means the opposite, that 
wealthy districts have more funding per weighted pu- 
pil than property-poor districts do. Only eleven states 
had a negative wealth-neutrality score in Quality 
Counts 2003. 
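
As a rough illustration of the single-variable regression described above, the sketch below indexes each series to its mean and takes the ordinary least squares slope. Several things here are assumptions rather than details from the text: the regression is unweighted, the state average is taken as a simple mean across districts, the inputs are already adjusted for cost differences and student needs, and the district figures are invented. The function names are hypothetical.

```python
def index_to_mean(values):
    """Rescale a series so that its mean equals 1 (assumed: simple district mean)."""
    mean = sum(values) / len(values)
    return [v / mean for v in values]


def ols_slope(x, y):
    """Slope from an ordinary least squares regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Invented figures for four districts (already cost- and need-adjusted).
wealth_per_weighted_pupil = [100_000.0, 200_000.0, 300_000.0, 400_000.0]
revenue_per_weighted_pupil = [5_800.0, 5_900.0, 6_100.0, 6_200.0]

score = ols_slope(index_to_mean(wealth_per_weighted_pupil),
                  index_to_mean(revenue_per_weighted_pupil))
# A positive score means wealthier districts have more revenue per weighted pupil.
print(score > 0)
```

The targeting score follows the same idea but uses state revenue alone as the dependent variable and adds the other district characteristics as controls, which requires a multiple regression routine.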

McLoone Index 

The McLoone Index is based on the assumption that 
if all students in the state were lined up according to 
the amount their districts spent on them, perfect eq- 
uity would be achieved if every district spent at least 
as much as was spent on the pupil in the middle of 
the distribution, or the median. The McLoone Index 
is the ratio of the total amount spent on pupils below 
the median to the amount that would need to be spent 
to raise all students to the median. 

The McLoone Index was calculated by first computing 
the median-level expenditure per pupil for each state 
(adjusted to reflect cost differences and student needs). 
The second calculation was the total number of dollars 
spent on students whose per pupil expenditure was 
below the median. Finally, that figure was divided by 
the total amount that would be spent if every pupil 
below the median had the median-level expenditure. 

For example, the median-level expenditure per pupil 
(adjusted to reflect student needs) in Indiana for 



Quality Counts 2003 was approximately $5,583. The 
total amount spent on students who were below that 
mark was about $3.04 billion. In order to spend 
$5,583 on each of those pupils below the median, 
the state would need to spend $3.32 billion. The 
calculation for Indiana’s McLoone Index for 2000 is 
as follows: 



McLoone Index 

= Amount spent on pupils below the median / 
Amount needed to be spent to achieve "equity" 
= $3.04 billion / $3.32 billion 
= 91.56 percent 



This indicates that state and local spending on chil- 
dren below the median was about 92 percent of what 
was needed in 2000 to raise all students to the me- 
dian expenditure. McLoone Index values in Quality 
Counts 2003 ranged from 87 percent to 100 percent, 
where perfect equity is represented by 100 percent 
and the greatest inequity by 0 percent. 
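
The three calculation steps can be sketched as follows. This is an illustration under stated assumptions: districts are given as (enrollment, adjusted spending per pupil) pairs, the median is pupil-weighted (the spending level of the district containing the middle pupil), and the five districts are invented. The function name `mcloone_index` is hypothetical.

```python
def mcloone_index(districts):
    """McLoone Index in percent for a list of (enrollment, spending_per_pupil) pairs."""
    ordered = sorted(districts, key=lambda d: d[1])
    total_pupils = sum(enrollment for enrollment, _ in ordered)

    # Step 1: spending on the pupil in the middle of the distribution.
    running = 0
    median = ordered[-1][1]
    for enrollment, spending in ordered:
        running += enrollment
        if running >= total_pupils / 2:
            median = spending
            break

    # Step 2: dollars actually spent on pupils below the median.
    below = [(e, s) for e, s in ordered if s < median]
    if not below:
        return 100.0  # no pupil is below the median, so equity is perfect
    spent = sum(e * s for e, s in below)

    # Step 3: dollars needed to bring each of those pupils up to the median.
    needed = sum(e for e, _ in below) * median
    return 100.0 * spent / needed


# Invented districts: (enrollment, adjusted spending per pupil).
example = [(400, 7000), (450, 6000), (500, 5000), (300, 4000), (350, 3000)]
print(round(mcloone_index(example), 1))  # 69.2
```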

Coefficient of variation 

The coefficient of variation is a measure of the dis- 
crepancy in funding across the districts in a state. This 
measure was calculated by dividing the standard de- 
viation of adjusted spending per weighted pupil (ad- 
justed to reflect cost differences and student needs) by 
the state’s average spending per pupil. For example, 
the standard deviation for spending in Maryland in 
2000 was about $584. The average spending per pu- 
pil for Maryland for the same year was $6,265. The 
calculation for Maryland’s coefficient of variation in 
2000 is as follows: 



Coefficient of variation 

= Standard deviation of adjusted spending per 
weighted pupil / Average spending per pupil 
= $584 / $6,265
= 9.3 percent 



If all districts in a state spent exactly the same amount 
per pupil, its coefficient of variation would be zero. As 
the coefficient gets higher, it means the variation in the 
amounts spent across districts also gets higher. As the 
coefficient gets lower, it indicates greater equity. In 
Quality Counts 2003, the range of values for the coeffi- 
cient of variation was 6 percent to 32 percent. 
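
A minimal sketch of this calculation follows. It assumes an unweighted population standard deviation across districts; the text does not say whether Quality Counts weights districts by enrollment, so that choice is an assumption, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(spending_per_pupil):
    """Standard deviation divided by the mean, in percent (unweighted; an assumption)."""
    return 100.0 * pstdev(spending_per_pupil) / mean(spending_per_pupil)


# Maryland, 2000, using the two summary figures quoted in the text.
print(round(100.0 * 584 / 6265, 1))  # 9.3

# Identical spending in every district gives a coefficient of zero.
print(coefficient_of_variation([6000, 6000, 6000]))  # 0.0
```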
Adequacy index

Since there is no consensus about how much money 
is necessary to provide an “adequate” education, the 
adequacy index uses the national average as the bench- 
mark against which to gauge state spending. While it 
may seem intuitive to measure adequacy simply by 
calculating the percent of students in districts where 
spending eclipses the national average, that calcula- 
tion is not ideal. Imagine if every district in a state 
were to spend exactly $5,593 per student, just $1 
below the 2000 national average. Spending on every 
student would be amazingly close to what is consid- 
ered to be adequate, yet no student in the state would 
seem to be enrolled in a district with adequate fund- 
ing. The adequacy index takes into account both the 
number (or percentage) of students enrolled in dis- 
tricts with adequate spending, and the degree to which 
spending is below adequate in districts where per pu- 
pil expenditures are below the national average. 

The adequacy index was calculated using district-level 
spending that was adjusted for student needs and 
regional cost differences. Each district where the per 
pupil spending was equal to or exceeded the national 
average received a score of 1 times the number of stu- 
dents in the district. Districts where the adjusted 
spending per pupil was below the national average 
received a score equal to their per pupil spending 
divided by the national average and then multiplied 
by the number of pupils in the district. The adequacy 
index is the sum of district scores divided by the to- 
tal number of students in the state. If all districts 
spent above the U.S. average, the state received a per- 
fect index of 100. 



Example:

District    Enrollment    Per pupil spending
1           400           $7,000
2           450           $6,000
3           500           $5,000
4           300           $4,000
5           350           $3,000
Total       2,000


In the above example, districts 1 and 2 are the only 
ones providing an adequate education (i.e., equal to 



or above the 2000 national average, $5,594). Scores 
for these districts are equal to their student enroll- 
ment. The percent of students attending schools in 
districts with adequate spending, then, is 850 di- 
vided by 2,000, or 42.5 percent. This is the equiva- 
lent of the indicator percent of students in districts with 
per pupil expenditures at or above the U.S. average. This 
figure, however, does not account for how close 
spending is to adequate in the remaining three dis- 
tricts, a problem that is corrected in the calculations 
below. 



District    Score
1           400
2           450

Districts 3 through 5 are below the U.S. average, so 
assigning scores to each district will tell us how “far” 
they are from adequate spending. Their scores are equal 
to their average spending divided by the U.S. average 
and multiplied by the number of pupils in the dis- 
trict, as shown below. 



District    Score
3           446.91 = ($5,000 / $5,594) * 500
4           214.52 = ($4,000 / $5,594) * 300
5           187.70 = ($3,000 / $5,594) * 350
Total       1,699.13 (for all five districts)

Adequacy index
= 1,699.13 / 2,000
= 84.96



This value represents an index against which it is pos- 
sible to compare the relative adequacy of the 50 states 
and the District of Columbia. In Quality Counts 2003, 
values for the adequacy index ranged from 70 to 100. 
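
The worked example above can be checked with a short sketch. The function names are hypothetical; the district figures and the $5,594 benchmark are taken from the example, and the result is expressed on the 0 to 100 scale used in the text.

```python
def adequacy_index(districts, national_average):
    """Adequacy index (0 to 100) for (enrollment, spending_per_pupil) pairs.

    Districts at or above the benchmark score their full enrollment; districts
    below it score enrollment scaled by spending / national_average.
    """
    total_pupils = sum(enrollment for enrollment, _ in districts)
    score = sum(enrollment * min(1.0, spending / national_average)
                for enrollment, spending in districts)
    return 100.0 * score / total_pupils


def percent_at_or_above(districts, national_average):
    """Percent of students in districts spending at or above the benchmark."""
    total_pupils = sum(enrollment for enrollment, _ in districts)
    covered = sum(enrollment for enrollment, spending in districts
                  if spending >= national_average)
    return 100.0 * covered / total_pupils


example = [(400, 7000), (450, 6000), (500, 5000), (300, 4000), (350, 3000)]
print(round(adequacy_index(example, 5594), 2))  # 84.96
print(percent_at_or_above(example, 5594))       # 42.5
```

The gap between 42.5 and 84.96 shows why the index is more informative than the simple share of students above the benchmark: it credits districts for how close they come to the national average.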

Education spending per student 

For this indicator, each state’s education spending per 
student was based on per pupil expenditure data taken 
from the NCES report Early Estimates of Public Elementary
and Secondary School Education Statistics. With
the Chambers Cost-of-Education Index, each state’s 
per pupil expenditure was adjusted for regional cost 
differences by dividing the expenditure by the state’s 
figure from the cost-of-education index. 
Methodology

The purpose of this study was to test the extent to 
which the equity and adequacy indicators used in 
Quality Counts each year represent actual changes in 
the way states collect and distribute funds for educa- 
tion. Four states were chosen as the sample for this 
analysis based on a recent court decision in each state 
and a subsequent change in the education finance sys- 
tem. Information regarding the school finance and liti- 
gation history of these four states — New Hampshire, 
New Jersey, Vermont, and Wyoming — was collected 
from a variety of sources. These included court case 
decisions, legislation on state funding system changes, 
analyses from private groups, the NCES report Public 
School Finance Programs of the U.S. and Canada: 1998-99
(National Center for Education Statistics 2001),
and other sources. 

The equity and adequacy indicators 
used in Quality Counts were calcu- 
lated for each state for the years be- 
fore and after education finance 
reform occurred in the states. The 
data for these indicators came in part 
from previously published data in 
past issues of Quality Counts, and for 
those years that were not covered in 
past issues, an additional analysis was 
conducted for this paper. 

Data were available for all equity indicators other than
the coefficient of variation for the years 1996, 1997,
1999, and 2000 from past Quality Counts publications.
Data for 1994, 1995, and 1998 were calculated for this
paper to better detect trends over time. Since coefficient
of variation has always been used in Quality Counts as a
measure of equity, data for this indicator were available
from past issues for the years 1994-1997, 1999, and 2000.

For adequacy, there were fewer years of data available
from past publications since Education Week only recently
started calculating the adequacy index. Results for the
adequacy index and the percent of students in districts
with per pupil expenditures at or above the U.S. average
were only available for 1999 and 2000 from previous
issues. Additional analyses of these indices were conducted
for this paper for 1994, 1995, and 1996 data. Like the
coefficient of variation, education spending per student is
a measure that has always been used in Quality Counts
adequacy grading, so data were available from 1996 to
2002. This indicator is calculated from more recent data,
as it is based on the NCES “Early Estimates” report. The
most recent data available for the equity indicators in
this study, and the data used for the adequacy index and
percent of students in districts with per pupil expenditures
at or above the U.S. average, are from the 1999-2000
school year.

State Finance and Litigation History 

New Hampshire 

Prior to the 1999-2000 school year, the funding sys- 
tem for public education in New 
Hampshire relied heavily on local 
property taxes. On average, 90 per- 
cent of funding came from local prop- 
erty taxes, 7 percent from the state, 
and 3 percent from the federal gov- 
ernment. Under the old system, 
school districts set the annual bud- 
gets for their schools. Once the vot- 
ers approved the budget, the budget 
was sent to the state, and the state 
determined the appropriate property 
tax necessary to raise the funds. In 
New Hampshire, local school boards 
do not have the power to levy taxes. 
While the state has tried at least twice 
before to revise its funding formula to erase disparities 
across districts (in 1919 and 1947) by setting maxi- 
mum property tax rates above which the state would 
provide the necessary support, on both occasions the 
legislature has failed to provide the necessary funds 
(National Center for Education Statistics 2001). 

New Hampshire’s system of state funding that was con- 
tested in court was based on a foundation formula that 
included weights for special education, vocational edu- 
cation, and grade-level enrollments. Local fiscal capac- 
ity was measured by assessed property valuation, school 
tax rates, and personal income. Although the state in- 
tended to fund the average district (based on wealth) at 
8 percent of operating expenditures, every year the ap- 
propriated funds were less than what was needed. The 
state had a few categorical programs in this old system,
including special education, vocational education, the
Kindergarten Incentive program, teacher retirement and
benefits, and capital outlay and debt service (National
Center for Education Statistics 2001).

In a series of rulings (Claremont I, II, and III), the New 
Hampshire Supreme Court mandated changes to this 
education funding system. Most notably, the Claremont 
II ruling declared the state’s system unconstitutional 
and ordered that the system may not remain in effect 
beyond the 1998-99 school year. After much legisla- 
tive wrangling, the system that passed the legislature in 
April 1999 under this mandate created a statewide prop- 
erty tax for education, and raised business taxes to cover 
the additional costs of providing an adequate education 
(Viadero 2001). A new system of distributing funds for 
education was implemented in the 1999-2000 school 
year. The process for distributing funds 
in this new system was relatively simi- 
lar to the old system; the major change 
was the collection of revenues through 
the statewide property tax. The state 
also greatly increased its responsibil- 
ity and share of education funding by 
implementing a base cost in the foun- 
dation formula, which was set by ana- 
lyzing the expenditures of a select 
group of schools. This base cost was 
the average per pupil expenditure of 
the lowest spending half of elemen- 
tary schools in districts where 40 to 
60 percent of students scored at or 
above Basic^4 on the New Hampshire
Educational Improvement and Assessment Program (Na- 
tional Center for Education Statistics 2001). 
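The base cost rule described above can be sketched in Python. This is an illustration only: the field names are hypothetical, each school record is assumed to carry its district's percentage of students scoring at or above Basic, the average is assumed to be unweighted, and an odd number of eligible schools is assumed to round the "lowest spending half" down. None of these details are specified in the source.

```python
from statistics import mean


def base_cost(elementary_schools):
    """Average per pupil spending of the lowest-spending half of eligible schools.

    Each school is a dict with hypothetical keys:
      'per_pupil'               - per pupil expenditure of the school
      'district_pct_basic_plus' - percent of district students at or above Basic
    """
    # Keep schools in districts where 40 to 60 percent scored at or above Basic.
    eligible = [s for s in elementary_schools
                if 40 <= s["district_pct_basic_plus"] <= 60]
    eligible.sort(key=lambda s: s["per_pupil"])
    lowest_half = eligible[: len(eligible) // 2]
    if not lowest_half:
        raise ValueError("not enough eligible schools to compute a base cost")
    return mean(s["per_pupil"] for s in lowest_half)


schools = [
    {"per_pupil": 4000, "district_pct_basic_plus": 45},
    {"per_pupil": 5000, "district_pct_basic_plus": 50},
    {"per_pupil": 6000, "district_pct_basic_plus": 55},
    {"per_pupil": 7000, "district_pct_basic_plus": 60},
    {"per_pupil": 3000, "district_pct_basic_plus": 80},  # excluded: outside 40-60
]
print(base_cost(schools))
```

In this made-up data set, the low-spending school in the high-scoring district is filtered out before the lowest-spending half is taken.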

An analysis of tax rates and school spending conducted
by the New Hampshire Center for Public Policy Studies
looked at the differences in actual spending in the
state before and after the implementation of the new
law. The study, School Finance Reform: The First Two
Years, concluded that while the new legislation did have
an impact on the amount spent overall on education in
New Hampshire, disparities in spending per student
across districts did not change greatly (Hall 2002).






Based on the reforms implemented, the equity and 
adequacy data for New Hampshire should experience 
the following changes: 

■ The overall assessment of equity in the state 
should remain fairly constant before and after the 
1998-99 school year. Taxation under the new 
system has been more equitable, but the distri- 
bution of spending for education has remained 
fairly constant. The main indicator that should 
change is the state equalization effort since the 
state greatly increased its share of funding. 

■ New Hampshire’s wealth-neutrality score should 
also rise slightly due to the switch to a statewide 
property tax. 

■ In terms of adequacy, the state should improve
on the adequacy index and education spending
per pupil after the 1998-99 school year, since
more money was provided across the board for
education.



Because the new education finance 
plan was approved in the late spring 
of 1999, many budgets for the 1999- 
2000 school year had already been 
passed. More changes in the adequacy 
indices should be reflected in the 
2000-01 data, as school boards were 
able to pass budgets with full knowl- 
edge of the new system. 



New Jersey 

The system for funding public education in New Jersey
was first declared unconstitutional in 1973 on
grounds that it did not meet the “thorough and efficient”
clause of the state constitution (Robinson v. Cahill,
303 A.2d 273). “Since that decision, the supreme court
has issued over a dozen school finance opinions, the
latest in May 2002” (Advocacy Center for Children’s
Educational Success With Standards 2002). By far the
most well-known series of decisions comes from the
case of Abbott v. Burke,
named for Raymond Abbott, an elementary student 
on whose behalf the first suit was filed in 1981, against
then state education commissioner Fred Burke. Nine
years later, the first state supreme court ruling in the
case declared that the funding system for poor, urban
districts was inadequate. One unique aspect of this
case is that the court addressed only the poorest districts
in the state (Newman 1990).

^4 Two scoring levels, Proficient and Advanced, were above Basic, and one scoring level, Novice, was below Basic.

Before the court required the state to change its edu- 
cation funding system again in 1997, New Jersey state 
education aid came from the general fund (20.1 per- 
cent), and the property tax relief fund, which was rev- 
enue from a state income tax (79.9 percent). Property 
taxes were the only source of local funds for schools, 
exacerbating inequalities in wealth at the local level. 
State aid was distributed with a foundation formula 
that included weights for grade level, vocational school, 
and adult education enrollments, with district shares 
also weighted for property wealth and 
aggregate income. Additionally, in re- 
sponse to the 1990 New Jersey Su- 
preme Court ruling, the state was 
already making adjustments for 30 
special-needs districts, where these 
districts had a foundation level that 
was 5 percent higher than other dis- 
tricts in the state, a different calcula- 
tion for local fiscal capacity, and 
different budget cap rules. New Jer- 
sey also had several categorical pro- 
grams including transportation, 
capital outlay and debt service, 
teacher retirement, special education, 
compensatory education, and private 
school aid (Gold, Smith, and Lawton 1995). 



In the years following the first Abbott ruling, at least
three different finance plans were implemented in New
Jersey, ending with the Comprehensive Educational
Improvement and Financing Act of 1996 (National
Center for Education Statistics 2001). This act was
first implemented in the 1997-98 school year and
remains in place, although the details of provisions
within the act have been continually litigated. By far,
the most prominent feature of the finance reform efforts
in New Jersey is the court-mandated requirement
for the state to fund the 30 poorest districts in the
state, known as the Abbott districts, at the same level
as the state’s wealthiest districts. This “Parity Remedy
Aid” was ordered in the fourth Abbott v. Burke decision,
in 1997.

New Jersey’s reformed system for ensuring equality
between these poorest districts and the wealthiest in
the state was made up mostly of supplemental and
parity aid to these districts. The state still used a
foundation formula to distribute its core curriculum
standards aid. Pupil counts were used as the basis for
this formula and were weighted by instructional levels.
A “T&E” (thorough and efficient) budget
amount was the basis of this formula, and was adjusted
for inflation every 2 years. Additional aid was
provided for the Abbott districts. New Jersey also
added a categorical program for early childhood education
in its new system, which was targeted to low-income
districts (National Center for Education
Statistics 2001).

Based on the changes in the state resulting from Abbott
v. Burke, the data from these indicators should change
in the following ways:

■ Equity indicators should have a clear increase for
the state after the 1997-98 school year, since many
of the reforms were targeted at bringing the Abbott
districts on par with wealthier districts. This should
mostly be evident in the state’s targeting score.

■ For adequacy, the indicators should improve slightly
since the state increased funding after the 1997-98
school year.



Vermont 

Vermont has had a fluctuating investment in its edu- 
cation system over the last 40 years. According to a 
report from the National Center for Education Statis- 
tics, “between 1964 and 1997, the state share of basic 
educational expenses varied between 20% and 37%” 
(National Center for Education Statistics 2001). The 
report describes a pattern of state funding during that 
time in which Vermont would take legislative action 
to reform the finance formula and raise funding when 
the state share of funding dropped to around 20 per- 
cent. The state would gradually allow the state share 
to drop toward 20 percent again, and then take new 
action. By 1997, the year a court case prompted the
most recent response from the state, the state share of
funding was about 25.3 percent (National Center for
Education Statistics 2001).

The Vermont system for financing education that was 
ultimately ruled unconstitutional was based on a foun- 
dation formula. The formula was allocated on average 
daily membership measured over 2 years. Local fiscal 
capacity was based on property value and income, al- 
though income was only a small factor in the formula. 
Average daily membership counts were weighted for 
secondary student enrollment (1.25), poverty (1.25), 
and transportation costs (1.0384 to 1.0714). The state 
foundation cost was $4,025, with no aid going to “gold 
towns” that were able to raise 1.5 times more than 
this level with local resources. There were no mini- 
mum or maximum expenditure limits; local voter will- 
ingness to pay was the only upper limit on these towns. 
The old system included several cat- 
egorical programs including transpor- 
tation, capital outlay and debt 
service, teacher retirement, special 
education, vocational education, pri- 
vate school aid, and early childhood 
education (Gold, Smith, and Lawton 
1995). 

In Brigham v. State, 692 A.2d 384 
(Vt. 1997), the Vermont Supreme 
Court ruled that the state foundation 
aid program and the state system of 
relying heavily on locally raised rev- 
enues were not in line with the re- 
quirements of the Vermont 
constitution. In its decision the court wrote that: 

In Vermont the right to education is so inte- 
gral to our constitutional form of government, 
and its guarantees of political and civil rights, 
that any statutory framework that infringes 
upon the equal enjoyment of the right bears a 
commensurate heavy burden of justification. 
(Brigham v. State, 692 A.2d 384 at 5 [Vt. 
1997]) 

While the decision indicated that equality in per pupil
spending across the state was not a necessary remedy
for the state’s education finance system, it also
declared that spending in a locality should not be a
function of property wealth.

Only 4 months after the ruling, the Vermont legisla- 
ture passed the Equal Education Opportunity Act of 
1997, No. 60 of the Acts of 1997 (Act 60), which 
created an income-sensitive, statewide property tax. 
Under this system, most residents actually paid a state 
tax of 2 percent of their income, but wealthier resi- 
dents and those living in residences on more than 2 
acres of land also continued to pay a 1.1 percent tax 
on their property value (Heaps and Woolf 1997).^
Revenues from this state property tax were distrib- 
uted to the districts at a rate of $5,448 starting in 
1997, with the new legislation requiring an annual 
increase in the per pupil funding matching the most 
recent cumulative price index (Tit. 
16 § 4011 [2003]). Prior to Act 60, 
some districts spent less per pupil 
than what came directly from the 
state after Act 60, yet the median 
spending per pupil in the state be- 
fore 1997 was closer to $6,200 
(Heaps and Woolf 1997). 

In order to spend more than the 
amount offered by the state, towns 
had to implement their own local 
property taxes, and under Act 60, 
these taxes also had to be income sen- 
sitive. In order to meet the court-man- 
dated requirement that school 
spending not be related to town property wealth, any 
local funds generated by a local property tax were sub- 
ject to a state-imposed equalized yield. This ensured 
that a property-poor town levying a 5 percent local 
property tax would get the same additional amount 
per pupil as a property-rich town taxing at the same 
rate, even if the property-rich town generated signifi- 
cantly more revenue. 
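The effect of an equalized yield can be illustrated with a small Python sketch. Everything numeric here is hypothetical, including the guaranteed yield of $4,000 per pupil per percentage point of tax and the property values; the source does not give the actual yield schedule. The sketch only shows the principle that two towns taxing at the same rate receive the same amount per pupil, with the difference flowing to or from the state.

```python
def equalized_yield(tax_rate_pct, property_value_per_pupil, yield_per_pct_point):
    """Return (raised, entitled, net_to_state) per pupil for one town.

    raised       - what the local tax actually generates per pupil
    entitled     - what the town may keep under the state-set yield
    net_to_state - positive: revenue recaptured; negative: town is supplemented
    """
    raised = property_value_per_pupil * tax_rate_pct / 100
    entitled = tax_rate_pct * yield_per_pct_point
    return raised, entitled, raised - entitled


YIELD = 4000  # hypothetical dollars per pupil per percentage point of local tax

poor_town = equalized_yield(1.0, 100_000, YIELD)
rich_town = equalized_yield(1.0, 1_000_000, YIELD)

for name, (raised, entitled, net) in [("poor", poor_town), ("rich", rich_town)]:
    print(f"{name}: raised {raised:,.0f}, entitled {entitled:,.0f}, net to state {net:,.0f}")
```

Both hypothetical towns end up with the same entitlement at the same rate, even though the property-rich town raises ten times as much.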

An important point about the Act 60 reforms relative
to this analysis is that the reforms were phased in over
3 years, so the act was not fully in place until 2001;
however, the major reforms occurred in 1999 (National
Center for Education Statistics 2001). In 1999, Vermont
changed its foundation formula to a two-tier system
and implemented a statewide property tax. The
two-tier formula included a block grant as the first
tier, and a guaranteed-yield program with a recapture
provision as the second. The new formula was based
on equalized pupils weighted by grade, poverty (measured
by food stamp participation), and enrollment of
English Language Learners. The formula also adjusted
for small schools, small school enrollment stability, and
whether a town was a receiving or sending town based
on the recapture provision (National Center for Education
Statistics 2001).

^ Vermont has since revised the school finance system that was implemented following Act 60. The changes, which will take effect in the
2004-05 school year, reduce the reliance on property taxes by raising the sales tax rate, eliminate the sharing pool, and increase per pupil
aid to schools.

An early analysis of state education spending data by
William Mathis at the University of Vermont found
that disparities in spending per pupil diminished
across the state, although not as much as the disparities
in property tax rates (Mathis 2000). Vermont
continued to make progress toward the goal of reducing
disparities in spending, according to a February
2002 report from the Rural School and Community
Trust (Jimerson 2002). This report found that the
difference in spending between property-poor and
property-rich towns was about 37 percent, or
$2,100 per pupil, in fiscal year (FY) 1998 compared
to a difference of only 13 percent, or $900 per pupil,
in FY 2002.

Based on the reforms implemented from Act 60, the 
equity and adequacy data should experience the fol- 
lowing changes: 

■ Adequacy index, average education spending per 
pupil, and the percent of students in districts 
spending at or above the U.S. average should 
improve slightly in the 1998-99 school year as 
students in the poorer districts receive more fund- 
ing. 

■ In terms of equity, the state share of funding and 
targeting scores should increase after 1998, and 
the McLoone Index should also improve as the 
spending in the poorest districts rises to at least 
the level of the state grant. 



■ Wealth neutrality should improve slightly with 
the statewide property tax and recapture provi- 
sion. 

An important aspect of the new system implemented
with Act 60 was taxpayer reaction in the wealthiest towns
in Vermont. Residents in these towns experienced the
largest tax increases, and lost a large share of the money
raised by these taxes to less fortunate towns. One response
in these towns was for residents to forgo raising
local taxes, and instead develop charitable funds to
give gifts to local schools, thus avoiding the recapture
provision of Act 60. Mathis points out that “16 towns
raised $7.3 million in gifts and thereby denied $15.7
million in recaptured funds” (Mathis 2000). It is important
to note that if these funds are not included in
reports to the federal government regarding state funding,
they will not be reflected in the indicators used in
Quality Counts.

Wyoming 

Some of the first signs of trouble for 
school finance in Wyoming date back 
to 1980, when the Wyoming Su- 
preme Court first ruled the system 
unconstitutional because it failed to 
offer equal protection as the state con- 
stitution mandates. By 1983, the 
state had implemented reforms that 
required minimum local taxes and 
created a recapture feature to take 
money from wealthier districts to help 
support smaller, rural districts. While 
these changes were supposed to be a temporary fix, 
the system remained in place (Miller 1995). 

Before the Wyoming Supreme Court required changes 
to the finance system again in 1995, the state used 
two main funding sources: a 12-mill statewide prop- 
erty tax and mineral production royalties from the fed- 
eral government. The state system was based on a 
formula allocated by classroom units where local ca- 
pacity was assessed mainly through property valua- 
tion. The formula also had a recapture provision, so if 
a local district’s revenue was greater than 109 percent 
of the state minimum level then the district had to 
return those funds to the state. Wyoming had only 
two main categorical programs, special education and 
transportation (Gold, Smith, and Lawton 1995). 
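The recapture provision described above can be sketched as a simple threshold rule. The function and parameter names are hypothetical, the dollar figures are invented, and the sketch assumes that only the revenue above the threshold is returned to the state; the source says only that the district "had to return those funds."

```python
def recapture_amount(local_revenue, state_minimum, threshold_pct=109):
    """Revenue a district returns to the state under a recapture threshold.

    Assumes only the excess above threshold_pct percent of the state minimum
    level is recaptured (an assumption; the source does not spell this out).
    """
    ceiling = state_minimum * threshold_pct / 100
    return max(0.0, local_revenue - ceiling)


# Hypothetical districts measured against a state minimum level of $1,000,000.
print(recapture_amount(1_200_000, 1_000_000))  # above the 109 percent ceiling
print(recapture_amount(1_050_000, 1_000_000))  # below the ceiling, nothing returned
```

Keeping the threshold as a parameter makes it easy to compare this rule with a stricter one, such as a threshold of 100 percent.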



In 1993, four large districts sued the state, and the
1995 Campbell decision required the state to:

(1) define the ‘basket’ of education every Wyoming
child should receive — the best we can
do, not just a minimal education; (2) undertake
the cost of education studies to determine
the actual cost of providing the basket in the
various sizes and types of school districts, taking
into account the needs of different kinds
of students; and (3) fund the basket — in that
order. (National Center for Education Statistics
2001)

Thus, the court required not just a remedy to the
equity problem in the state, but it also answered
the question of what is adequate.

The state hired an outside contractor to conduct the
necessary research. By April of 1997, Management
Analysis & Planning Associates submitted their
proposal to the Wyoming legislature. Among the
changes suggested was a move from a formula based on
classrooms, a model which favored smaller, more rural
districts, to a model built around average daily
membership. The new plan also added cost adjustments
for cost-of-living, teacher seniority, and pupil
characteristics, and it made the state the authority for
determining total district revenue and education-related
taxes (Guthrie et al. 1997).

Most new provisions for Wyoming’s reformed system
were put in place for the 1998-99 school year. The
new school aid formula was mostly a block grant based
on average daily membership measured over 3 years,
with adjustments for local cost of living and districts
with less than 1,350 average daily membership. Local
capacity was still measured by local average property
valuation, but the recapture provision was reduced from
districts that raised 109 percent of the state minimum
to those able to raise 100 percent. The state developed
prototypes based on enrollment and class-size levels
to define the educational basket of costs. This basket
was built around 25 cost components in five categories:
personnel, supplies, special services, special student
characteristics, and district or regional
characteristics. Wyoming still had only two main
categorical programs, for special education and
transportation.



Based on the changes resulting from the Campbell case
and the ensuing Cost-Based Block Grant model implemented
by the legislature and phased in by the state
during the 1997-98 and 1998-99 school years, the
equity and adequacy data should experience the
following changes:

■ Adequacy in the state should improve slightly
after the 1998-99 school year due to an additional
$50 million necessary to enact the recommended
reform on top of the $600 million the
state already spends annually.

■ Funding in Wyoming should become more equitable
over this time, because the state now largely
has control over total spending per student, and
because the distribution of funding by the state
has moved from a per classroom basis to a per
student formula, the same unit of measure
used in the equity indicators.

State Results 

New Hampshire 

Until this past year, New Hampshire had been one of
the worst scoring states on the equity grades in Quality
Counts. Over the last 4 years the state consistently
received an F on equity grades. The main reason for
this was the state’s low share of education funding, and
because of that, its low state equalization effort. New
Hampshire always had the lowest state equalization
effort of the 50 states, with a score in the low teens
(table 1). On the other hand, New Hampshire had a very
good targeting score; at one point, in 1997, it was -.734.
According to these equity indicators, although New
Hampshire may have had very little state funding for
education, the state targeted what funding it did provide
very heavily to property-poor districts.

In 2000, New Hampshire’s state equalization effort
skyrocketed to 57.1 percent. This is a very drastic
change from 14.2 percent the year before and indicates
that the state greatly increased its investment in
education. Another interesting change in the equity
indicators from 1999 to 2000 was a large jump in
New Hampshire’s targeting score. In 2000, it was almost
zero (-.009), indicating almost no targeting of
funds. This is very significant, considering how heavily
the state targeted funds to property-poor districts in
the past. It appears that although the state made great
strides to increase its funding overall, it no longer
targeted funds as heavily to property-poor districts.

Table 1. Changes in equity and adequacy indicators over time in New Hampshire

Indicator                             1994    1995    1996    1997    1998    1999    2000    2001    2002
State equalization effort             12.8    11.2    10.8    13.0    14.1    14.2    57.1       —       —
Targeting score                      -.608   -.584   -.537   -.734   -.547   -.558   -.009       —       —
Wealth-neutrality score               .161    .174    .152    .233    .174    .173    .162       —       —
McLoone Index                         .864    .860    .878    .883    .876    .894    .887       —       —
Coefficient of variation              17.1    16.8    17.5    16.9       —    18.2    17.5       —       —
Education spending per student
  (adjusted for regional cost
  differences)                           —       —  $5,541  $5,805  $5,942  $6,195  $6,437  $6,967  $7,563
Adequacy index                       87.03   87.18   84.76       —       —   91.47   91.78       —       —
Percent of students in districts
  with per pupil expenditures
  at or above U.S. average           24.81    23.0   20.27       —       —   38.91   39.55       —       —

— Not available.

SOURCE: Bureau of the Census: Public Elementary/Secondary Education Finance Data, 1994-2000; Small Area Income and Poverty
Estimates, School District Estimates, 1995 and 1997; and School District Data Book, 1990. National Center for Education Statistics:
Common Core of Data: Public Elementary/Secondary School District Universe Data, 1994-2000, and Early Estimates of Public
Elementary and Secondary School Education Statistics, school year 1996-97 to school year 2001-02; and Chambers Cost of
Education Index, 1993-94.

One change that would be expected with such an increase
in state funding, but that did not accompany the
changes in the state’s equalization effort and targeting
score, is a marked increase in New Hampshire’s adequacy
index. Since the state had such a strong increase in its
share of funding, it would be expected that the state
would improve on the adequacy index. Instead, the
score for this index only rose slightly in 1999 and 2000
(table 2). When looking at education spending per
student, however, it is clear that in 2001 and 2002,
funding in New Hampshire grew substantially. From
1997 to 1999, education spending per student increased
by only $390, while from 2000 to 2002, it increased by
over $1,000. This is a large increase and will likely
be matched by a jump in New Hampshire’s adequacy
index when 2001 and 2002 data become available.
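The spending changes cited above follow directly from the cost-adjusted figures reported for New Hampshire in table 1. A short Python check, with a hypothetical helper name, makes the arithmetic explicit:

```python
# Cost-adjusted education spending per student for New Hampshire (table 1).
spending_per_student = {
    1996: 5541, 1997: 5805, 1998: 5942, 1999: 6195,
    2000: 6437, 2001: 6967, 2002: 7563,
}


def change(start_year, end_year):
    """Dollar change in spending per student between two years (hypothetical helper)."""
    return spending_per_student[end_year] - spending_per_student[start_year]


print(change(1997, 1999))  # the two years before the reform took effect
print(change(2000, 2002))  # the two years after
```

The second difference is roughly three times the first, which is the contrast the text draws.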

Another indicator that did not change over the years
1994 through 2000 is New Hampshire’s coefficient
of variation. This indicator remained fairly steady and
fairly high throughout this time period, meaning that
the state still has a lot of variation in spending across
districts; in 1999, the coefficient was 18.2, the highest
of all the years of data. New Hampshire, according
to these indexes, still has a great deal of inequity across
the districts in the state.

Table 2. Expectations and results by each indicator for New Hampshire

Indicator                           Expectations    Results from 1999 to 2000
State equalization effort           ↑               ↑
Targeting score                     ↔               ↓
Wealth-neutrality score             ↑ slightly      ↑ slightly
McLoone Index                       ↔               ↔
Coefficient of variation            ↔               ↔
Adequacy index                      ↑               ↑ slightly
Education spending                  ↑               ↑ slightly
Percent at or above U.S. average    ↑               ↑

NOTE: ↑ = improved; ↓ = worse; ↔ = stable.

SOURCE: Bureau of the Census: Public Elementary/Secondary
Education Finance Data, 1994-2000; Small Area Income and
Poverty Estimates, School District Estimates, 1995 and 1997;
and School District Data Book, 1990. National Center for
Education Statistics: Common Core of Data: Public Elementary/
Secondary School District Universe Data, 1994-2000, and
Early Estimates of Public Elementary and Secondary School
Education Statistics, school year 1996-97 to school year
2001-02; and Chambers Cost of Education Index, 1993-94.
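For readers unfamiliar with the coefficient of variation, the following Python sketch shows one standard way to compute it: the standard deviation of per pupil spending divided by the mean, expressed as a percentage. The pupil weighting and the function name are assumptions made for illustration; the exact Quality Counts calculation is not restated in this section, and the district figures are invented.

```python
import math


def coefficient_of_variation(districts):
    """Pupil-weighted coefficient of variation, in percent.

    districts: iterable of (per_pupil_spending, pupils) pairs.
    """
    districts = list(districts)
    total_pupils = sum(pupils for _, pupils in districts)
    if total_pupils <= 0:
        raise ValueError("need at least one pupil")
    mean = sum(spend * pupils for spend, pupils in districts) / total_pupils
    variance = sum(pupils * (spend - mean) ** 2
                   for spend, pupils in districts) / total_pupils
    return 100 * math.sqrt(variance) / mean


# Two hypothetical districts of equal size spending $4,000 and $6,000 per pupil.
print(coefficient_of_variation([(4000, 100), (6000, 100)]))
# Identical spending everywhere means no variation at all.
print(coefficient_of_variation([(5000, 100), (5000, 300)]))
```

A lower value indicates less dispersion in spending across districts, which is why a steady value near 17 or 18 is read as persistent inequity.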
Table 3. Changes in equity and adequacy indicators over time in New Jersey

Indicator                             1994    1995    1996    1997    1998    1999    2000    2001    2002
State equalization effort             47.3    39.8    43.4    43.6    42.5    55.1    48.4       —       —
Targeting score                      -.098   -.134   -.125   -.126   -.155   -.170   -.171       —       —
Wealth-neutrality score               .112    .132    .089    .085    .092    .098    .046       —       —
McLoone Index                         .892    .894    .906    .911    .904    .946    .916       —       —
Coefficient of variation                 —    12.8    11.8    11.5       —    11.7    13.2       —       —
Education spending per student
  (adjusted for regional cost
  differences)                           —       —  $8,118  $8,176  $8,436  $8,801  $8,667  $9,362  $8,328
Adequacy index                       99.97   99.94   99.94       —       —     100   99.99       —       —
Percent of students in districts
  with per pupil expenditures
  at or above U.S. average           99.80   99.15   99.15       —       —   99.92   99.78       —       —

— Not available.

SOURCE: Bureau of the Census: Public Elementary/Secondary Education Finance Data, 1994-2000; Small Area Income and Poverty
Estimates, School District Estimates, 1995 and 1997; and School District Data Book, 1990. National Center for Education Statistics:
Common Core of Data: Public Elementary/Secondary School District Universe Data, 1994-2000, and Early Estimates of Public
Elementary and Secondary School Education Statistics, school year 1996-97 to school year 2001-02; and Chambers Cost of
Education Index, 1993-94.



New Jersey 

According to equity and adequacy indicators calcu- 
lated for New Jersey, there was a noticeable spike in 
the figures in 1999 (tables 3 and 4). New Jersey 
showed improvement for that year in its targeting 
score, state equalization effort, McLoone Index, and 
education spending per student. The state had a lower 
targeting score, and a jump in its state share of fund- 
ing. This combination led to an increase in its state 
equalization effort from 42.5 in 1998 to 55.1 in 1999. 
This figure fell back down again slightly in 2000 to 
48.4, but still remained an improvement over previ- 
ous years. The state’s McLoone Index also rose in 1999 
to .946 from .904, indicating that New Jersey was 
closer to having all of its students in districts spend- 
ing at least the median expenditure. Like the state 
equalization effort, the McLoone Index fell again the 
next year, to .916, but was still an improvement over 
previous years. 

New Jersey shows a similar pattern in its education spending
per pupil, although the jump to higher spending did
not occur until 2001. The state spent more on its students
in 2001 than in any other year of these
data, rising from $8,667 in 2000
to $9,362 in 2001. Like the state equalization effort
and McLoone Index, this indicator fell back the next year,
to $8,328.



Table 4. Expectations and results by each indicator for New Jersey

Indicator                           Expectations    Results from 1998 to 1999
State equalization effort           ↑               ↑
Targeting score                     ↑               ↑
Wealth-neutrality score             ↑               ↑ slightly
McLoone Index                       ↑               ↑
Coefficient of variation            ↑               ↔
Adequacy index                      ↑ slightly      ↑ slightly
Education spending                  ↑ slightly      ↑ slightly
Percent at or above U.S. average    ↑ slightly      ↑ slightly

NOTE: ↑ = improved; ↓ = worse; ↔ = stable.

SOURCE: Bureau of the Census: Public Elementary/Secondary 
Education Finance Data, 1994-2000; Small Area Income and 
Poverty Estimates, School District Estimates, 1995 and 1997; 
and School District Data Book, 1990. National Center for 
Education Statistics: Common Core of Data: Public Elementary/ 
Secondary School District Universe Data, 1994-2000, and 
Early Estimates of Public Elementary and Secondary School 
Education Statistics, school year 1996-97 to school year 
2001-02; and Chambers Cost of Education Index, 1993-94. 



From 1994 to 2000, New Jersey has had steady im- 
provement in its wealth-neutrality score. This improve- 
ment seems to be even more apparent in 1998 and 
1999. New Jersey’s wealth-neutrality score was .092 
in 1998 and .098 in 1999. These scores indicate that 
for these years, when both state and local funding are 






Table 5. Changes in equity and adequacy indicators over time in Vermont

Indicator                            1994     1995     1996     1997     1998     1999     2000     2001     2002
State equalization effort            36.4     35.1     34.2     34.8     35.0     87.9     92.8        —        —
Targeting score                     -.383    -.364    -.449    -.450    -.401    -.455    -.530        —        —
Wealth-neutrality score              .161     .152     .211     .162     .182     .334     .311        —        —
McLoone Index                        .903     .892     .863     .860     .889     .866     .867        —        —
Coefficient of variation             19.0     16.1     16.2     18.6        —     19.2     19.9        —        —
Education spending per student
  (adjusted for regional cost
  differences)                          —        —   $6,259   $6,764   $6,512   $6,746   $7,408   $8,622   $9,907
Adequacy index                      95.49    93.34    92.82        —        —    97.01    97.52        —        —
Percent of students in districts
  with per pupil expenditures
  at or above U.S. average          53.83    48.25    48.33        —        —     68.9    81.57        —        —

— Not available.

SOURCE: Bureau of the Census: Public Elementary/Secondary Education Finance Data, 1994-2000; Small Area Income and Poverty 
Estimates, School District Estimates, 1995 and 1997; and School District Data Book, 1990. National Center for Education Statistics: 
Common Core of Data: Public Elementary/Secondary School District Universe Data, 1994-2000, and Early Estimates of Public 
Elementary and Secondary School Education Statistics, school year 1996-97 to school year 2001-02; and Chambers Cost of 
Education Index, 1993-94. 



considered, poor and wealthy districts in New Jersey 
spent similar amounts of money on education. 

There was not a great deal of fluctuation in New
Jersey’s adequacy indicators; however, this was expected
to some extent, since the state’s education
spending per pupil has consistently been much higher
than the national average. The state’s adequacy index
and the percent of students in districts with per pu- 
pil expenditures at or above the national average re- 
mained above 99 from 1994 to 2000. The adequacy 
index reached 100 in 1999, but like the equity indi- 
cators, fell slightly in 2000. 

Vermont 

Vermont had a drastic improvement in its equity grade 
from Quality Counts 2001 to Quality Counts 2002. This 
reflected 1997 and 1999 data, respectively. Over this 
time Vermont had strong changes in two indicators, 
state equalization effort, which makes up 50 percent 
of the equity grade, and wealth-neutrality score (tables 
5 and 6). Vermont’s state equalization effort rose sub- 
stantially. It was 35.0 in 1998, 87.9 in 1999, and 
92.8 in 2000. This is not only a strong change for a 
state in one year, but it also made Vermont a state 
with one of the highest state equalization efforts. Ver- 
mont has also shown improvement in its targeting score 
over these years, most recently having a score of -.530 
in 2000, the best score of the 50 states. 



Table 6. Expectations and results by each indicator for Vermont

Indicator                           Expectations    Results from 1998 to 1999
State equalization effort           ↑               ↑
Targeting score                     ↑               ↑
Wealth-neutrality score             ↑ slightly      ↓
McLoone Index                       ↑               ↓ slightly
Coefficient of variation            ↔               ↓ slightly
Adequacy index                      ↑ slightly      ↑ slightly
Education spending                  ↑ slightly      ↑ slightly
Percent at or above U.S. average    ↑ slightly      ↑ slightly

NOTE: ↑ = improved; ↓ = worse; ↔ = stable.

SOURCE: Bureau of the Census: Public Elementary/Secondary 
Education Finance Data, 1994-2000; Small Area Income and 
Poverty Estimates, School District Estimates, 1995 and 1997; 
and School District Data Book, 1990. National Center for 
Education Statistics: Common Core of Data: Public Elementary/ 
Secondary School District Universe Data, 1994-2000, and 
Early Estimates of Public Elementary and Secondary School 
Education Statistics, school year 1996-97 to school year 
2001-02; and Chambers Cost of Education Index, 1993-94. 



Unfortunately, although Vermont made gains in state 
equalization effort, it had a worse wealth-neutrality 
score for these same years. Vermont always had a posi- 
tive wealth-neutrality score, meaning that property- 
poor districts, on average, had less state funding per 
weighted pupil than wealthy districts; however, in re- 
cent years its wealth-neutrality score has gotten even 
Table 7. Changes in equity and adequacy indicators over time in Wyoming

Indicator                            1994     1995     1996     1997     1998     1999     2000     2001     2002
State equalization effort            51.5     49.0     56.0     56.9     54.5     56.6     58.1        —        —
Targeting score                      .105     .088    -.026    -.093    -.034     .000    -.025        —        —
Wealth-neutrality score             -.153    -.151    -.123    -.202    -.203    -.152    -.189        —        —
McLoone Index                        .848     .874     .948     .932     .842     .934     .958        —        —
Coefficient of variation             15.1     13.6     14.7     15.7        —     13.0     12.9        —        —
Education spending per student
  (adjusted for regional cost
  differences)                          —        —   $6,499   $6,297   $6,590   $6,790   $7,853   $8,657   $8,957
Adequacy index                      96.32    96.53    94.47        —        —      100      100        —        —
Percent of students in districts
  with per pupil expenditures
  at or above U.S. average          42.92    43.20    37.92        —        —      100      100        —        —

— Not available.

SOURCE: Bureau of the Census: Public Elementary/Secondary Education Finance Data, 1994-2000; Small Area Income and Poverty 
Estimates, School District Estimates, 1995 and 1997; and School District Data Book, 1990. National Center for Education Statistics: 
Common Core of Data: Public Elementary/Secondary School District Universe Data, 1994-2000, and Early Estimates of Public 
Elementary and Secondary School Education Statistics, school year 1996-97 to school year 2001-02; and Chambers Cost of 
Education Index, 1993-94. 



higher. Vermont’s score rose from .162 in 1997 to .311 
in 2000, and peaked at .334 in 1999. This shows that 
although Vermont has made efforts to increase equity 
in the state, total state and local funding is still linked 
to the property wealth of the districts. 

Vermont increased its education spending per pupil by 
more than $3,000 from 1999 to 2002. Spending in- 
creased from $6,746 per student in 1999 to $9,907 in 
2002. Another adequacy indicator that showed gains over 
this time was the percent of students in districts with per 
pupil expenditures at or above the national average. This 
indicator rose to 81.6 in 2000 from 68.9 in 1999. 

Wyoming 

Throughout the mid-nineties, Wyoming has shown 
improvement on three equity indicators: targeting 
score, state equalization effort, and McLoone Index. 
In 1996, Wyoming’s targeting score became negative, 
meaning that it started to target funds to property- 
poor districts. This trend continued for the rest of the 
years of data analyzed, with the exception of 1999, 
where Wyoming’s targeting score was zero (tables 7 
and 8). The state also increased its state equalization
effort from 49.0 in 1995 to 56.0 in 1996. The
state’s McLoone Index also rose in 1996, from .874 in
1995 to .948, showing that a greater number of stu- 
dents were in districts with expenditures close to the 
state median. 



Table 8. Expectations and results by each indicator for Wyoming

Indicator                           Expectations    Results from 1998 to 1999
State equalization effort           ↑               ↑ slightly
Targeting score                     ↑               ↑ slightly
Wealth-neutrality score             ↑               ↑
McLoone Index                       ↑               ↔
Coefficient of variation            ↑               ↔
Adequacy index                      ↑ slightly      ↔
Education spending                  ↑ slightly      ↑
Percent at or above U.S. average    ↑ slightly      ↔

NOTE: ↑ = improved; ↓ = worse; ↔ = stable.

SOURCE: Bureau of the Census: Public Elementary/Secondary
Education Finance Data, 1994-2000; Small Area Income and
Poverty Estimates, School District Estimates, 1995 and 1997;
and School District Data Book, 1990. National Center for
Education Statistics: Common Core of Data: Public Elementary/
Secondary School District Universe Data, 1994-2000, and
Early Estimates of Public Elementary and Secondary School
Education Statistics, school year 1996-97 to school year
2001-02; and Chambers Cost of Education Index, 1993-94.



Wyoming’s wealth-neutrality score has always been nega- 
tive, meaning that when local and state funding is con- 
sidered, property-poor districts, on average, spend more 
per student than wealthier districts. Most recently, with 
2000 data, Wyoming had the best wealth-neutrality score 
of the 50 states. This indicator has been consistently 
good with only some fluctuation since 1994. 
Wyoming also had a strong rise in funding per student
in recent years, from $6,790 in 1999 to $7,853
in 2000. This rise in funding continued for the next 2 
years, reaching $8,657 in 2001 and $8,957 in 2002. 
This increase in funding had a strong effect on the 
percent of students in districts with per pupil expen- 
ditures at or above the U.S. average. In 1996, only 
37.9 percent of students were in districts spending at 
or above the U.S. average, while in 1999 and 2000, 
100 percent of students fell into this category. 

Discussion of Results 

For the most part, the equity and adequacy indicators 
used in this analysis correspond to the school funding 
changes documented for each of the four states. Re- 
forms in New Jersey and New Hampshire were very 
well matched to the indicators, and 
Vermont and Wyoming also lined up 
fairly well. There were only two cases 
(in Vermont) where the results were 
the opposite of what was anticipated. 

Twenty-three of a possible 32 indica- 
tors (8 indicators by 4 states) matched 
what was expected from the reforms 
occurring in the states. 

In New Hampshire, all indicators 
matched what was expected with 
the exception of the targeting score, 
which was actually worse in 2000. 

In New Hampshire it was expected 
that the equity picture would not 
change very much, since the distribution of funding 
in the state remained fairly constant even after the 
new finance system was implemented. The only ex- 
ception to this was that the state equalization effort 
was expected to improve with New Hampshire mak- 
ing a greater investment in education. In fact, the 
state share of funding did increase a great deal in 
1999-2000, which was reflected in a strong change 
in the state equalization effort. For adequacy, it was
expected that there would be much more change
than for equity. The adequacy index and education
spending per pupil were expected to go up due to 
new funding across the board in New Hampshire. 
Although the adequacy index only rose slightly in 
1999 and 2000, education spending per pupil rose 
substantially between 2000 and 2002. This makes 
it likely that when data are available for district level
calculations based on 2001 and 2002, the adequacy
index for those years will be even higher. 

For New Jersey, it was expected that adequacy should 
improve slightly, but that the real gains would be in a 
clear increase in the equity indicators after 1998. This 
analysis found that for New Jersey, all indicators ex- 
cept coefficient of variation matched what was expected 
in light of the reforms implemented by the state. In
1999, New Jersey in fact had a very noticeable spike
in its equity indicators, especially its targeting score, 
state equalization effort, and McLoone Index. The state 
also showed improvement in its education spending 
per student, although not until 2001. There was not 
a lot of change in New Jersey’s adequacy indicators;
however, the state has always done fairly well in this
area. An interesting pattern that emerged in the indi- 
cators for New Jersey is that after this 
spike in 1999, most of the indicators 
fell, which was not expected. This 
result may be tied to the continuing 
battle in the state over funding and 
equity issues. 

For Vermont, five of the eight indica- 
tors in this analysis matched what was 
expected from changes made in the 
state school funding system. Wealth- 
neutrality score, McLoone Index, and 
coefficient of variation did not follow 
the results that were expected. In Ver- 
mont, the greatest change was in the 
state equalization effort for 1999 and
2000, which matches the expectation that the state
share of funding would increase after 1998. The tar- 
geting score for Vermont also had a fairly strong im- 
provement in 1999 and 2000. For adequacy, it was 
expected that all three indicators would improve slightly 
from 1998 to 1999, which matched exactly with the 
results. In addition, in 2001 and 2002, education 
spending per pupil in Vermont began to increase rap- 
idly, indicating that in future years of data the ad- 
equacy index for the state should continue to improve. 

Wyoming was the state in this study with the least 
congruence between expected and actual results; only 
four of the eight indicators matched expectations. Most 
of the changes in equity indicators appeared in 1996, 
but reform was not implemented until 1998 and 1999. 
Interestingly, the court made its decision in 1995, so 



the indicator changes in 1996 may have been a result
of smaller adjustments made in how schools are fi- 
nanced inspired by the ruling and implemented be- 
fore the legislature revised the entire system. One way 
Wyoming did match expectations was in an increase 
in education spending per student. Also, it was ex- 
pected that the adequacy index for the state should 
improve, and the data show that in 1999 and 2000 
Wyoming had a perfect adequacy index. Wyoming also 
greatly increased its percent of students in districts 
with per pupil expenditures at or above the U.S. aver- 
age over this time. 

Conclusion 

This analysis found that indicators for New Hamp- 
shire and New Jersey matched very well with expecta- 
tions from changes these states made in their school 
finance systems, and Vermont and Wyoming’s results 
matched fairly well.⁶ Even though there was a strong
match between indicators and expectations, some in- 
dicators were more accurate than others. According to 
the results for the four states selected for this analysis, 
the state equalization effort (state share of funding and 
targeting score) and the three adequacy indicators were 
well matched to the reforms occurring in the states. 
The other equity indicators (wealth-neutrality score,
McLoone Index, and coefficient of variation) were not
as clearly matched in all cases and were less predictable.
This reaffirms the weighting system used for grading
equity in Quality Counts, since state equalization
effort constitutes half of the grade, and the other three
indicators together constitute the other half.

It is encouraging to see that for the most part the 
indicators used in Quality Counts reflect the court- 



mandated changes that occurred in these states. This 
shows that it is possible to develop accurate assump- 
tions about what these states were doing with their 
education finance systems based on these indices and 
the data they are derived from. Interestingly, although 
it is reassuring to see that these indicators are in fact 
reflecting true changes in state policy, it is impor- 
tant to have the context of what has happened or is 
happening in a state when making assumptions based 
on these numbers. 

Another interesting factor evident from this analysis is 
the problem of the time lag in the availability of fed- 
eral school finance data because of the difficulty in 
collecting and standardizing data across all 50 states 
and the District of Columbia. For Quality Counts 2003, 
the most recent data available for these indicators re- 
flected the 1999-2000 school year. During the time 
between when these data are collected and when they 
are published, the states could have implemented dras- 
tic changes to their school finance systems. This is 
another reason that contextual information — especially 
current information — is important to consider when 
making assumptions about these indicators and how 
they are changing over time. 

In part due to the results of this study, Education Week
is conducting a state policy survey on how states raise 
revenues and distribute funds for education. Some of 
these data will be published in Quality Counts 2004, 
and more will be included in a regular issue of Educa- 
tion Week in the winter of 2004. These data will serve 
to not only help inform the indicators described in this 
paper and used in Quality Counts, but also will be a 
tool for school finance researchers to use as current back- 
ground information and context for their analyses. 



⁶ One reason Vermont and Wyoming may have had less clear results from this analysis is that reforms in these states were implemented over
several years, and this analysis looked for changes in indicators in the single year where the most changes occurred.
Appendix

After the data files were downloaded and merged for
the most recent year available, the next step was to
eliminate districts outside the scope of the analysis,
whose purpose was to measure equity and adequacy in
public elementary and secondary schools only. A district
was deleted if it met any of the following conditions:
a school level other than elementary, secondary, or
unified (SCHLEV should be 1, 2, or 3 only), no schools
(SCH## = 0), a state or federal agency type (TYPE## = 3,
4, 5, 6, or 7), or fall membership (V33) of less than 200.
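
The exclusion rules above can be sketched as a simple filter. This is a minimal illustration, not the code used in the study: the `District` record and its field names are hypothetical stand-ins for the SCHLEV, SCH##, TYPE##, and V33 variables, and the sample rows are invented.

```python
from dataclasses import dataclass


@dataclass
class District:
    name: str
    schlev: int       # SCHLEV: school level code (1, 2, or 3 are kept)
    n_schools: int    # SCH##: number of schools in the district
    agency_type: int  # TYPE##: agency type code (3 through 7 are dropped)
    v33: int          # V33: fall membership


def keep_district(d: District) -> bool:
    """Return True if the district survives all four exclusion rules."""
    return (
        d.schlev in (1, 2, 3)
        and d.n_schools > 0
        and d.agency_type not in (3, 4, 5, 6, 7)
        and d.v33 >= 200
    )


# Invented sample rows: one district that is kept, then one per exclusion rule.
sample = [
    District("kept", 3, 5, 1, 1200),
    District("other level", 4, 2, 1, 900),
    District("no schools", 1, 0, 1, 300),
    District("state agency", 2, 1, 3, 500),
    District("too small", 1, 1, 1, 150),
]
kept = [d.name for d in sample if keep_district(d)]
print(kept)
```

A district with exactly 200 students is retained here, because the text excludes only fall membership of less than 200.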

After eliminating appropriate data, the next step was 
to calculate the variables needed for the indicators. 
Following is a list of all the equations and calculations 
that were used in this study. 

1. Adjusted State Revenue per Pupil Index

a. Total State Enrollment (TSE) = (Σ V33 for each state)

b. Share of Total State Enrollment (STSE) = V33 / TSE

c. State-Indexed Cost of Education Index (SICEI)
= CEIL93 / (Σ (STSE * CEIL93) for each state)

d. Adjusted Cost of Education Index (ADJCEI) =
(0.85 * SICEI) + 0.15 (Assign a 1 if missing data)

e. Adjusted State Revenue (ADJSTRV) = (TSTREV / ADJCEI)

f. Adjusted State Revenue per Pupil (ADSTRVPP)
= (ADJSTRV * 1000) / V33

g. Total Adjusted State Revenue for Each State
(TADSTREV) = (Σ ADJSTRV for each state)

h. Average Adjusted State Revenue per Pupil for
Each State (AVGASTPP) = (TADSTREV / TSE) * 1000

i. Adjusted State Revenue per Pupil Index
(ADSPPIND) = (ADSTRVPP / AVGASTPP)
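
Steps a through i can be chained together as in the sketch below, which handles the districts of one state at a time. It is an illustration under stated assumptions rather than the study's own code: the dictionary keys are hypothetical, revenue is assumed to be recorded in thousands of dollars (suggested by the factor of 1000), and districts with a missing cost index are skipped when the weighted sum in step c is formed, a detail the appendix does not specify.

```python
def adjusted_state_revenue_index(districts):
    """Steps 1a-1i for the districts of ONE state.

    Each district is a dict with hypothetical keys:
      "V33"    fall membership
      "CEI"    cost of education index, or None if missing
      "TSTREV" total state revenue, assumed to be in thousands of dollars
    Returns a list of ADSPPIND values, in input order.
    """
    tse = sum(d["V33"] for d in districts)  # 1a
    # Denominator of 1c: enrollment-share-weighted cost index. Skipping
    # districts with a missing index is an assumption of this sketch.
    weighted_cei = sum(
        (d["V33"] / tse) * d["CEI"] for d in districts if d["CEI"] is not None
    )
    per_pupil = []
    adjusted_total = 0.0
    for d in districts:
        if d["CEI"] is None:
            adjcei = 1.0  # 1d: assign 1 if data are missing
        else:
            sicei = d["CEI"] / weighted_cei  # 1c
            adjcei = 0.85 * sicei + 0.15     # 1d
        adjstrv = d["TSTREV"] / adjcei       # 1e
        per_pupil.append(adjstrv * 1000 / d["V33"])  # 1f
        adjusted_total += adjstrv            # 1g
    avgastpp = adjusted_total / tse * 1000   # 1h
    return [pp / avgastpp for pp in per_pupil]  # 1i


# Invented two-district state with identical costs.
state = [
    {"V33": 1000, "CEI": 1.0, "TSTREV": 4000},
    {"V33": 3000, "CEI": 1.0, "TSTREV": 6000},
]
index = adjusted_state_revenue_index(state)
print([round(x, 3) for x in index])
```

By construction, the enrollment-weighted average of the index within a state equals 1, so values above 1 mark districts receiving more adjusted state revenue per pupil than the state average.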

2. Adjusted District Wealth per Pupil Index

a. Adjusted District Wealth (ADJWLTH) = (WEALTH / ADJCEI)

b. Total Adjusted District Wealth (TADWTH) =
(Σ ADJWLTH for each state)

c. Adjusted District Wealth per Pupil (ADJDWPP)
= (ADJWLTH / V33)

d. Average Adjusted District Wealth per Pupil
(AVGDWPP) = (TADWTH / TSE)

e. Adjusted District Wealth per Pupil Index
(ADDWIND) = (ADJDWPP / AVGDWPP)

3. Percent of Students in Poverty Index

a. Estimated Percentage of Children 5 to 17 in Poverty
(POVPER) = (POORCHRN / CHILD)

b. Estimated Number of Children in Poverty
(NUMPOV) = POVPER * V33

c. Total Estimated Number of Children in Poverty
for Each State (TPOVST) = (Σ NUMPOV for each state)

d. Average Percentage of Children in Poverty for Each
State (POVAVG) = (TPOVST / TSE)

e. Percent of Students in Poverty Index
(POVINDEX) = (POVPER / POVAVG)

4. Percent of Special Education Students Index

a. Total Number of Special Education Students in
Each State (TOTSPCED) = (Σ SPECED## for each state)

b. Percent of Enrollment that is Special Education
(PERIEP) = (SPECED## / V33)

c. Average Percent Enrollment for Each State
(IEPAVG) = (TOTSPCED / TSE)

d. Percent Special Education Students Index
(IEPINDEX) = (PERIEP / IEPAVG)

5. Enrollment-Squared Index

a. District Enrollment Squared (V33_2) = (V33)^2

b. Total District Enrollment Squared for Each State
(TOTV33_2) = (Σ V33_2 for each state)

c. District Enrollment Squared per Pupil Index
(ENSQUIND) = (V33_2 / TOTV33_2) / (V33 / TSE)

6. Land Area per Pupil Index

a. Area per Pupil (AREAPP) = (AREA / V33)

b. Total Area per State (TSAREA) = (Σ AREA for each state)

c. Average Area per Pupil for Each State
(AVGARPP) = (TSAREA / TSE)

d. Area Per Pupil Index (AREAINDX) = (AREAPP / AVGARPP)

7. Weighting Variable (WGT) = (V33 / TSE) *
Number of Districts in Each State

8. State Share of Funding

a. Adjusted Local Revenue (ADJLOCRV) =
(LOCREV / ADJCEI)

b. Adjusted State & Local Revenue (STLOCREV)
= (ADJLOCRV + ADJSTRV)

c. Total Adjusted State & Local Revenue (TASLR)
= (Σ STLOCREV for each state)

d. State Share of Funding for Each State
(STSHARE) = (TADSTREV / TASLR)

9. Wealth-Neutrality Score

a. Weighted Enrollment (WGTENROL) =
(SPECED## * 1.3) + (NUMPOV * 0.2) + V33

b. Total Weighted Enrollment for Each State
(TOTWTENR) = (Σ WGTENROL for each state)

c. Adjusted Local Revenue (ADJLOCRV) =
(LOCREV / ADJCEI)

d. Adjusted State & Local Revenue (STLOCREV)
= (ADJLOCRV + ADJSTRV)

e. Adjusted State & Local Revenue per Weighted
Pupil (ASLRPWP) = (STLOCREV / WGTENROL)

f. Total Adjusted State & Local Revenue (TASLR)
= (Σ STLOCREV for each state)

g. Average Adjusted State & Local Revenue per
Weighted Pupil (AVASLRWP) = (TASLR / TOTWTENR)

h. Adjusted State & Local Revenue per Weighted
Pupil Index (ASLRPIND) = (ASLRPWP / AVASLRWP)

i. Adjusted District Wealth per Weighted Pupil
(ADWWP) = (ADJWLTH / WGTENROL)

j. Average Adjusted District Wealth per Weighted
Pupil (AADWPWP) = (TADWTH / TOTWTENR)

k. Adjusted District Wealth per Weighted Pupil
Index (ADJWWP) = (ADWWP / AADWPWP)
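
The following sketch computes weighted enrollment (step a) and the two per-weighted-pupil indices (steps h and k) for one state. The appendix lists these inputs but does not state how the wealth-neutrality score itself is derived from them, so the final step, an unweighted least-squares slope of the revenue index on the wealth index, is purely an assumed illustration of how such a relationship might be summarized. All function names and figures are hypothetical.

```python
def weighted_enrollment(v33, speced, numpov):
    """9a: WGTENROL = (SPECED## * 1.3) + (NUMPOV * 0.2) + V33."""
    return speced * 1.3 + numpov * 0.2 + v33


def per_weighted_pupil_index(amounts, weighted_pupils):
    """Amount per weighted pupil divided by the state average (9e-9h, 9i-9k)."""
    state_avg = sum(amounts) / sum(weighted_pupils)
    return [a / w / state_avg for a, w in zip(amounts, weighted_pupils)]


def ols_slope(x, y):
    """Assumed summary statistic: unweighted least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


# Invented, already cost-adjusted figures for three districts of one state.
wgt = [weighted_enrollment(1000, 0, 0)] * 3
wealth = [1000.0, 2000.0, 3000.0]   # ADJWLTH
revenue = [10.0, 20.0, 30.0]        # STLOCREV
wealth_idx = per_weighted_pupil_index(wealth, wgt)     # ADJWWP
revenue_idx = per_weighted_pupil_index(revenue, wgt)   # ASLRPIND
slope = ols_slope(wealth_idx, revenue_idx)
print(round(slope, 3))
```

In this invented case revenue rises in step with wealth, so the slope is positive; a negative slope would correspond to property-poor districts having more revenue per weighted pupil, the pattern the text describes for Wyoming.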

10. McLoone Index

a. Adjusted Expenditures (ADJEXP) = (TCURELSC / ADJCEI)

b. Adjusted Expenditures per Weighted Pupil
(ADJPWP) = (ADJEXP * 1000) / WGTENROL
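
The appendix stops at adjusted expenditures per weighted pupil (ADJPWP) and does not spell out the index itself. The sketch below applies the conventional definition of the McLoone Index: actual spending on pupils in districts below the median, divided by what would be spent if those pupils were all brought up to the median. Using a pupil-weighted median is an assumption of this sketch, and the function names and figures are hypothetical.

```python
def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half the total."""
    half = sum(weights) / 2
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("no data")


def mcloone_index(spending_per_pupil, pupils):
    """Ratio of actual spending below the median to spending at the median."""
    median = weighted_median(spending_per_pupil, pupils)
    below = [(s, p) for s, p in zip(spending_per_pupil, pupils) if s < median]
    if not below:
        return 1.0  # no district spends less than the median
    actual = sum(s * p for s, p in below)
    at_median = median * sum(p for _, p in below)
    return actual / at_median


# Invented state with three equally sized districts.
example = mcloone_index([6000.0, 8000.0, 10000.0], [100, 100, 100])
print(example)
```

An index of 1 means no student is in a district spending below the median, which is why a rising index is read in the text as an improvement in equity.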

11. Adequacy Index

a. Adjusted Cost of Education Index for Adequacy
Indicators (ADJCEIAQ) = (State level CEI for
districts missing CEI data, else CEI)

b. Adjusted Expenditures for Adequacy Indicators
(ADJEXPAQ) = (TCURELSC / ADJCEIAQ)

c. Adjusted Expenditures per Weighted Pupil for
Adequacy Indicators (ADJPWPAQ) =
(ADJEXPAQ * 1000) / WGTENROL
Introduction

In June of 1997, the elected leaders of Vermont en- 
acted the Equal Educational Opportunity Act (Act 
60) in response to a state supreme court decision in 
the Brigham v. State case. Act 60 may well have repre- 
sented the most radical reform of a state’s system of 
public school financing since the post-Serrano, post-Proposition 13
changes in California in the late 1970s. As a result, Act 60 could
provide a unique opportunity to determine if dramatic school finance reforms
like those enacted in Vermont generate greater equal- 
ity in measured student performance. This paper rep- 



resents an attempt to document the changes in the 
distributions of spending and of student performance 
that have occurred in the post-Act 60 period. 

This paper begins with an overview of the institutional 
structure of educational finance and provision in Ver- 
mont. One purpose of this overview is to make the 
argument that the Vermont case is particularly inter- 
esting because there have not been dramatic demo- 
graphic changes in the state that could obscure the 
impact of finance reforms. With this context estab- 
lished, I review the research on the links between fi- 
nance reforms and the distributions of education 
spending and of student performance. After briefly dis- 
cussing the data utilized, I examine the extent to which 
there has been convergence across school districts in 
expenditures and in student performance. 

All of the available data support the conclusion that 
the link between spending and taxable resources has 
been significantly weakened and that spending, how- 
ever it is measured, is substantially more equal. I also 
present evidence that the cross-district dispersion of 
the performance of fourth-graders on standardized tests 
of mathematics has declined post-Act 60. And there 



95 



Developments in School Finance: 2003 



is no evidence of increased cross-district dispersion of 
the test performances of second- and eighth-graders. 

School Finance Reform in Vermont 

In 1995 in Lamoille Superior Court, a group of plaintiffs
that included Amanda Brigham, a 9-year-old student
in the Whiting School District (Burkett 1998),
filed suit against the State of Vermont. The goal of
the suit was to force substantive reform of a system 
of school financing that the plaintiffs felt deprived 
students in property-poor school districts of equal 
educational opportunities and forced taxpayers in 
these property-poor districts to assume a dispropor- 
tionate burden of the financing of public education. 
On February 5, 1997, the Supreme Court of the State 
of Vermont ruled in favor of the plaintiffs, concluding that the
existing system deprived “children of an equal educational
opportunity in violation of the Vermont Constitution” (Brigham v.
State, 166 Vt. at 249). The court left it to the legislature to craft a new
financing system that was consistent with the state constitution.

The focus in the plaintiffs’ suit on 
both inequalities in educational 
spending and disparities in property 
tax burdens grew out of longstanding 
dissatisfaction in Vermont with the 
existing foundation system of educa- 
tion financing and the existing sys- 
tem of property taxation. Prior to Act 
60, Vermont used a traditional foundation formula to 
determine the state aid a town received: 

(1) Total state aid = (Weighted average daily membership) * (Foundation amount)
    - (Foundation tax rate) * (Aggregate fair market value) * 0.01,

where the weighted average daily membership was 
determined by assigning weights of 1.25 to secondary 
students and students receiving food stamps, assign- 
ing weights of between 1.0385 and 1.0714 to stu- 
dents who must be transported to school, and 
averaging the weighted counts from the previous 2
school years (Mathis 1995). While the foundation
amount was set with the intent of permitting districts 
spending that amount to meet state standards for those 
students assigned a weight of 1, fluctuations in the 
state’s fiscal status led the state legislature to adjust 
the foundation tax rate so as to reduce the state’s aid 
liability. As a result, the state share of basic educational 
expenditures fluctuated between 0.20 and 0.37, with 
the share declining when the state economy weakened 
(Mathis 2001). The period leading up to Act 60 was a 
period of decline in the state share. 
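
Equation (1) can be worked through with a short sketch. The function and parameter names are hypothetical, the town's figures are invented, and the tax rate is assumed to be expressed in percent (which is how the factor of 0.01 is read here). The example also shows why raising the foundation tax rate lowers the state's aid liability.

```python
def total_state_aid(weighted_adm, foundation_amount,
                    foundation_tax_rate_pct, aggregate_fair_market_value):
    """Equation (1): foundation cost minus the required local contribution."""
    foundation_cost = weighted_adm * foundation_amount
    local_share = foundation_tax_rate_pct * aggregate_fair_market_value * 0.01
    return foundation_cost - local_share


# Invented town: 500 weighted pupils, a $5,000 foundation amount,
# and $100 million in aggregate fair market value.
aid_low_rate = total_state_aid(500, 5000, 1.0, 100_000_000)
aid_high_rate = total_state_aid(500, 5000, 1.5, 100_000_000)
print(aid_low_rate, aid_high_rate)
```

With these invented figures, moving the foundation tax rate from 1.0 to 1.5 percent cuts state aid to the town by $500,000 while leaving the foundation amount unchanged.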

The widespread dissatisfaction with the existing school 
financing system had not been ignored by elected offi- 
cials. In both 1994 and 1995, the state house of the 
Vermont Legislature approved legislation designed to 
overhaul education financing. While this legislation 
failed to pass the state senate, the leg- 
islation contained key elements of the 
eventual response to the Brigham de- 
cision (McClaughry 2001). 

The legislation, by highlighting con- 
cerns about education financing and 
property taxation, also influenced the 
dynamics of the 1996 election. The 
state senate that was elected in 1996 
was committed to property tax reform 
(Mathis 2001). The result was a state 
legislature that was ready to move on 
legislation that would comply with the 
Brigham decision and reduce the prop- 
erty tax burdens of poor individuals. 

Given the political dynamic in Vermont, the speed with 
which Act 60, the legislation designed to comply with 
Brigham and to provide property tax relief, was passed 
surprised no one. Signed into law on June 26, 1997, 
Act 60 created a system of school financing that com- 
bined elements of foundation and power equalization 
plans. A statewide property tax was established, with 
revenues from the tax being used to finance a portion 
of foundation aid.¹ If in a locality property tax revenues
generated by levying the statewide rate exceed
the amount needed to finance the foundation level of 
spending, the excess property tax revenues are recap- 
tured by the state. 






¹ In the 2000-2001 school year, the nominal property tax rate was 1.1 percent, and the foundation level was $5,200.
Under Act 60, localities are allowed to choose spending
levels in excess of the foundation level. To weaken
the link between property wealth and spending in ex- 
cess of the foundation level, the act established a power 
equalization scheme that ensured that localities with
the same nominal tax rates would have the same levels 
of education spending. The power equalization scheme 
also included a unique recapture element; all spend- 
ing in excess of the foundation level is drawn from a 
sharing pool that consists of all the property tax rev- 
enues generated by property tax rates in excess of the 
statewide rate. As a result, no revenues from statewide 
taxes are used to finance the power equalizing portion 
of the school finance system. Further, when the voters 
in a locality choose a nominal property tax rate above 
the statewide rate, the revenues that will be available 
for that locality’s schools will not be known with cer- 
tainty until all other localities have 
made their taxing decisions and the 
size of the sharing pool is established. 

While the Brigham decision forced 
state policymakers to implement fi- 
nance reforms, the reality was that Act 
60 was as much about property tax 
relief as it was about school finance 
reform. For taxpayers in many com- 
munities, the finance reforms by 
themselves would have dramatically 
reduced tax burdens by allowing lo- 
calities to maintain or even increase 
education spending with substantially 
lower tax rates. At the same time, tax- 
payers in high-wealth communities, which have been 
labeled “gold towns,” necessarily faced increases in their 
property tax payments.² To lessen the burden on low-income
residents of the “gold towns,” the drafters of
Act 60 included in the legislation a provision that 
granted tax adjustments to certain homestead owners. 
These tax adjustments were explicitly linked to the 
taxpayer’s income; the original legislation specified that 
all owners with incomes at or below $75,000 were 
eligible for adjustments. 

All of these changes in the property tax were clearly 
designed to shift some of the burden of financing 
Vermont’s schools away from state residents to corporations
and nonresident owners of property in Vermont.³

Thus, Act 60 continues the recent tradition of linking 
school finance reforms and tax relief that is exempli- 
fied by Michigan. For this reason, any complete evalu- 
ation of the success of Act 60 must consider both the 
changes in education provision and the changes in tax 
burdens. Therefore, this paper nec- 
essarily provides a partial view of the 
welfare implications of Act 60. 

The school finance reforms that were 
the central element of Act 60 were 
phased in over several years, with the 
new regime not fully in place until 
the 2000-2001 academic year. Nev- 
ertheless, as was true in California in 
the aftermath of Serrano and Propo- 
sition 13, in some districts there were 
surprisingly rapid responses to Act 
60. Not surprisingly, in the gold 
towns there was vocal opposition to 
Act 60. Also unsurprising, given the 
California experience, were the efforts in these towns 
to encourage residents to make voluntary contributions
to the schools⁵ and to shift to town governments
responsibility for financing certain “school” functions. 






² In the 1994-95 school year, 69 of the 248 towns in Vermont for which data were available had effective education property tax rates
below $1.10 per $100 in assessed value. While the percentage of towns with effective education rates below $1.10 had undoubtedly
declined by the 1997-98 school year, the last year before the phasing in of Act 60 began, the reality was still that Act 60 forced a sizable
fraction of the towns in Vermont to increase property tax rates.

³ The correlation between each town’s effective education property tax rate in 1994-95 and the fraction of that town’s property that was
owned by town residents in 1998-99 was 0.5461. In other words, towns with low effective property tax rates prior to Act 60 also tended
to be towns in which a large fraction of the property tax burden was exported.

⁴ See Heaps and Woolf (2000) and Jimerson (2001) for efforts to evaluate both the implications of Act 60 for educational provision and
the effects of Act 60 on property tax burdens.

⁵ Since towns that collected sufficient funds from individual contributions could avoid participating in the sharing pool, in most gold towns
education funds were established and property owners were encouraged to contribute to these funds. Participation rates varied across
towns, ranging up to 87 percent in Manchester, where aggressive tactics, such as publication of the names of nonparticipants, were used
to encourage giving. More traditional incentives were also used to encourage giving; in fiscal years 1999, 2000, and 2001, the Freeman
Foundation matched individual donations to the funds.

Another California parallel is the apparent growth in
private school enrollments that has been mentioned
in press reports.

Care needs to be taken, however, to avoid making too 
much of the California parallels. Act 60 gave Vermont 
school districts much more discretion over the level of 
expenditures than California districts have. The tax 
price of education spending has increased in the gold 
towns, but spending is not being forcibly leveled down 
as it was in California. Also, low-wealth towns were 
not required to maintain local effort; several towns used 
the Act 60 windfall primarily to reduce nominal prop- 
erty tax rates. As a result, low-wealth towns were not 
necessarily leveled up. The reality is that Act 60 did
not duplicate the California reforms, a fact on which
the next section expands.

Act 60 did not duplicate the Califor- 
nia reforms in one other important 
way. The California reforms predated 
the nationwide push for accountabil- 
ity; Act 60 was passed at a time when 
most states were attempting to 
strengthen accountability and educa- 
tional standards. As a result, several 
elements of the legislation built on 
the existing system of testing and 
standards to strengthen accountabil- 
ity. For example, under Act 60 all 
districts were required to develop ac- 
tion plans to improve student perfor- 
mance on the tests that are part of 
the Vermont Comprehensive Assessment System. In 
addition, the state board of education was mandated 
to take on a more active oversight role. Nevertheless, 
the central elements of the state’s accountability sys- 
tem were unaffected by Act 60. 

Why Study Vermont? — A Review of 
Research on the Impact of School 
Finance Reforms 

The school finance reforms implemented in Cali- 
fornia in the aftermath of that state’s supreme court 






decision in the Serrano v. Priest case and of the nearly 
contemporaneous tax limits imposed by Proposition 
13 represent a watershed both in the debate over 
the structure of school finance reforms and in the 
direction of research into the impact of those re- 
forms. In the post-Serrano period, the California 
reforms and their supposed effects on the schools in 
that state have been discussed in every state in which 
school finance reforms have been implemented. Ver- 
mont is no exception; the supposed parallels be- 
tween the California reforms and Act 60 have been 
mentioned repeatedly.⁶

The California reforms also shifted the focus of research 
on the impact of school finance. Prior to the reforms, 
the focus in the literature was almost solely on the 
impact of finance reforms on spending inequality. Af- 
ter Serrano, the scope of the analysis 
broadened to include the impact of 
finance reforms on the level and dis- 
tribution of student achievement, on 
housing prices, on the supply of private
schooling, and even on the composition
of affected communities.⁷
The California reforms also became 
the touchstone for theoretical work. 
Papers like those of Nechyba (1996, 
2000), Benabou (1996), and 
Fernandez and Rogerson (1997, 
1998) used a California-like system 
as the post-reform case when trying 
to reach predictions about the likely 
effects of finance reform. 

The problem with using the California case as a bench- 
mark is that the case has proven to be the exception, 
not the rule. First, the limits imposed on local control 
over spending have not been duplicated in any other 
state. Even in Michigan and Vermont, the states in 
which the most extensive post-Serrano reforms have 
been implemented, some degree of local control over 
taxes and spending is permitted. Further, the popula- 
tion of students served by California schools changed 
more dramatically than the population of students in 
any other state in the nation. From 1986 to 1997, the 
percentage of the California public school student 



⁶ For examples of references to California, see McClaughry (1997) and Mathis (1998).

⁷ The papers dealing with these varied topics are too numerous to cite. Evans, Murray, and Schwab (1999) and Downes and
Figlio (1999, 2000) cite many of the relevant papers.

population identified as minority increased from 46.3
percent to 61.2 percent. Nationally, the percent minority
grew far more slowly, from 29.6 percent to 36.5
percent.⁸ As Downes (1992) notes, these demographic
changes make it difficult to quantify the impact of the 
finance reforms in California on the cross-district in- 
equality in student achievement. 

The possibility that California might be the exception 
and not the rule pushed a number of researchers to 
pursue national-level studies attempting to document 
the impact of finance reforms. On the spending side, 
Silva and Sonstelie (1995), Downes and Shah (1995), 
and Manwaring and Sheffrin (1997) all took slightly 
different approaches to quantifying the effect of finance 
reforms on mean per pupil spending in a state. Be- 
cause they used district-level data, Hoxby (2001), 
Evans, Murray, and Schwab (1997), 
and Murray, Evans, and Schwab 
(1998) were able to consider not only 
the effects of finance reforms on mean 
spending but also the extent to which 
spending inequities were reduced by 
those reforms. As a result, these stud- 
ies provide the most obvious sources 
for predictions of the long-run effects 
of Act 60. The problem is that these 
studies generate contradictory predic- 
tions. Hoxby’s results would lead us 
to expect leveling down, since Act 60 
dramatically increases tax prices in 
towns with more property wealth. 

Murray, Evans, and Schwab conclude 
that court-mandated reforms like Act 60 typically re- 
sult in leveling up. 

The same lack of a clear prediction would be appar- 
ent to the reader of national-level attempts to deter- 
mine how the distribution of student performance 
in a state is affected by a finance reform. Hoxby 
(2001) represents the first attempt to use national- 
level data to examine the effects of finance reforms 



on student performance. She finds that dropout rates 
increase about 8 percent, on average, in states that 
adopt state-level financing of the public schools. Al- 
though Hoxby’s work does not explicitly address the 
effect of equalization on the within-state distribu- 
tion of student performance, it seems likely that much 
of the growth in dropout rates occurred in those dis- 
tricts with relatively high dropout rates prior to 
equalization. In other words, these results imply that 
equalization could adversely affect both the level and 
the distribution of student performance. 

While the dropout rate is an outcome measure of con- 
siderable interest, analyses of the quality of public edu- 
cation in the U.S. tend to focus on standardized test 
scores and other measures of student performance that 
provide some indication of how the general student 
population is faring. Husted and 
Kenny (2000) suggest that equaliza- 
tion may detrimentally affect student 
achievement. Using data on 34 states 
from 1976-77 to 1992-93, they 
find that the mean SAT score is higher 
for those states with greater within- 
state spending variation. However, 
the period for which they have test 
score information, 1987-88 to 
1992-93, postdates the imposition 
of the first wave of finance reforms. 
Thus, the data do not permit direct 
examination of the effects of policy 
changes. In addition, because they use 
state-level data, Husted and Kenny 
cannot examine the degree to which 
equalization affects cross-district variation in test 
scores.⁹ Finally, since only a select group of students
take the SAT, Husted and Kenny are not able to con- 
sider how equalization affects the performance of all 
students in a state. 

Card and Payne (2002) explore the effects of school 
finance equalizations on the within-state distributions 






⁸ Generating comparable numbers for earlier years is difficult. Nevertheless, the best available data support the conclusion that these sharp
differences in trends in the minority share predate the Serrano-inspired reforms. For example, calculations based on published information
for California indicate the percentage minority in 1977-78 was approximately 36.6 percent. Nationally, estimates based on the October
1977 Current Population Survey indicate the percent minority was 23.9 percent.

⁹ Husted and Kenny do find evidence consistent with the conclusion that, in states which have school finance reforms, these reforms have
no impact on the standard deviation of SAT scores. Since, however, the standard deviation of test scores could be unchanged even if cross-district
inequality in performance had declined, this evidence fails to establish that finance reforms do not reduce cross-district performance
inequality.

of SAT scores. They characterize a school finance
policy as more equalizing the more negative the 
within-state relationship between state aid to a school 
district and school district income is. They find that 
the SAT scores of students with poorly educated par- 
ents (their proxy for low income) increase in states 
that, under their definition, become more equalized. 
Data limitations, however, make it impossible for 
Card and Payne to examine the effects of policy 
changes on students residing in school districts in 
which the changes had the greatest impact. More- 
over, while Card and Payne correct for differences in 
the fractions of the population taking the SAT test, 
it is still very likely that the students who come from 
low-education backgrounds but take the SAT test are 
a very select group and are extremely unlikely to be 
representative of the low-income or low-education 
population as a whole.¹⁰
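
The idea of measuring equalization by the relationship between state aid and district income can be sketched as follows. The sketch assumes that the relationship is summarized by the slope of a one-variable least-squares regression of aid on income within a state; Card and Payne's actual specification is not reproduced here, and the district figures are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a one-variable least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical districts in one state: district income and state aid
# per pupil, both in thousands of dollars.
income = [30.0, 40.0, 50.0, 60.0]
aid_before = [3.0, 2.9, 2.8, 2.7]   # aid falls slowly as income rises
aid_after = [4.0, 3.5, 3.0, 2.5]    # aid falls more steeply after reform

slope_before = ols_slope(income, aid_before)
slope_after = ols_slope(income, aid_after)

# A more negative slope is read as a more equalizing finance policy.
more_equalizing = slope_after < slope_before
print(slope_before, slope_after, more_equalizing)
```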

Downes and Figlio (2000) attempt 
to determine how the tax limits and 
finance reforms of the late 1970s and 
early 1980s affected the distribution 
of student performance in states in 
which limits were imposed and how 
student performance has changed in 
these states relative to student per- 
formance in states in which no limits 
or finance reforms were imposed. The 
core data used in the analysis were 
drawn from two national data sets, the 
National Longitudinal Study of the 
High School Class of 1972 (NLS:72) 
and the 1992 (senior year) wave of the National Educa- 
tion Longitudinal Study of 1988 (NELS:88/92). The 
NELS data were collected sufficiently far from the 
passage of most finance reforms to permit quantifica- 
tion of the long-run effects of these reforms by ana- 
lyzing changes in the distributions of student 
performance between the NLS:72 cross-section and 
the NELS cross-section. 

Downes and Figlio find that finance reforms in response 
to court decisions, like that in the Brigham case, result 
in small and frequently insignificant increases in the 
mean level of student performance on standardized tests 



of reading and mathematics. Further, they note that 
there is some indication that the post-reform distribu- 
tion of scores in mathematics may be less equal. This 
latter result highlights one of the central points of the 
paper: any evaluation of finance reforms must control
for the initial circumstances of affected districts. The 
simple reality is that finance reforms are likely to have 
differential effects in initially high-spending and ini- 
tially low-spending districts. 

The fundamental reason for the absence of clear pre- 
dictions of the impact of finance reforms has been 
mentioned by a number of authors (e.g., Downes 
and Shah [1995], Hoxby [2001], Evans, Murray, and 
Schwab [1997]), all of whom have emphasized the 
tremendous diversity of school finance reforms. In a 
national-level study, any attempt to classify finance 
reforms will be imperfect. So, even 
though there is general consensus 
that the key elements of a finance 
reform are the effects of the reform 
on local discretion, the effects of the 
reform on local incentives, and the 
change in state-level responsibilities 
in the aftermath of reform (Hoxby 
[2001], Courant and Loeb [1997]), 
different authors take different ap- 
proaches to account for the hetero- 
geneity of the reforms. The result is 
variation in predictions generated by 
studies that are asking the same fun- 
damental question. The answer, it 
seems, is not to try to improve the 
methods of classifying reforms, but is instead to care- 
fully analyze certain canonical reforms. Act 60 is likely 
to be just such a canonical reform. 

In looking for guidance for an analysis of the Vermont 
reforms, the first case to consider is that of Kentucky, 
where the reforms that followed a court decision in- 
validating the system of school finance may represent 
the most radical change to a state’s system of public 
schooling provision. Flanagan and Murray (2002) 
document the effects of the reforms in Kentucky. Un- 
fortunately, because the reforms in Kentucky were so 
extensive, any lessons from that case are probably not 






¹⁰ For instance, among the students in Card and Payne’s low-parental-education group, in 28 states in 1978 (25 states in 1990) fewer than
10 percent took the SAT examination and in 20 states in 1978 (15 states in 1990) fewer than 3 percent took the SAT. Further, in 1978 
no state had more than 36.2 percent of the low-parental-education group take the SAT. 

particularly relevant for those attempting to predict
the effect of reforms that, like Act 60, primarily affect 
the system of school finance. 

Thus, the most direct antecedent in this case-study 
approach to analyzing finance reforms is Downes 
(1992), who showed that the extensive school finance 
reforms in California in the late 1970s generated 
greater equality across school districts in per pupil 
spending but not greater equality in measured stu- 
dent performance. Duncombe and Johnston’s (2002) 
work on Kansas offers an example of a recent case 
study of a canonical reform. This study of Vermont 
is another such example. Will the outcomes in Ver- 
mont duplicate those in California? What are the simi- 
larities in and differences between the results for 
Vermont, Kansas, and Kentucky? The data used to 
answer these questions are described 
in the next section. 



Data

Sources

The majority of the data that are analyzed in this paper
are drawn from the Vermont School Report and from
publications of the Vermont Department of Taxes. In
addition, town-level data on school expenditures were
drawn from Heaps and Woolf (2000) and from files created
by the Vermont Department of Education and posted at
http://www.state.vt.us/educ/new/html/data/perpupil.html.
The Vermont Indicators Online database, which is
maintained by the Center for Rural Studies at the
University of Vermont, was the source of some pre-1999
information on income, demographics, and property
wealth at the town level. Finally, the Common Core
of Data, maintained by the National Center for Education
Statistics, was the source of school-level data on the
racial/ethnic composition of each school’s student body.

The norm in Vermont is that towns and school districts
are coterminous. There are, however, numerous
deviations from the norm. Some small towns do not
operate elementary or secondary schools; the children
from these towns are sent to public or even private
schools in neighboring communities, with tuition
payments going from the sending towns to the receiving
schools.¹¹ Many other towns do not have their own high
schools, choosing either to “tuition out” their high
school students or to participate in union high school
districts.¹² Since one of the goals of this research was
to quantify the impact of Act 60 on the inequality in
services provided to the schoolchildren of Vermont, the
school district had to be the fundamental unit of
analysis. Thus, several decisions had to be made to
ensure that what was presented was the most accurate
picture of the impact of Act 60 on the distributions of
expenditures and student performance.


First, all towns that were not 
tuitioning out students at the elemen- 
tary level were matched to the school 
district serving elementary school stu- 
dents from that town. The same 
matching process was done for towns 
not tuitioning out high school students.¹³
Knowing the town-school
district matches made it possible to 
create school-district-level versions of 
some variables that were only avail- 
able at the town level. Second, towns 
were grouped into types based on in- 
stitutional arrangements. This made it possible to ex- 
amine separately the impact of Act 60 on school 
districts linked to towns with different institutional 
arrangements. 

Nevertheless, the reality in Vermont is that school 
spending levels are voted on in town meetings, that 
state aid flows to towns and not school districts, and 
that analyses of the impact of Act 60 have tended to



¹¹ In the 2001-02 school year, 824 equalized pupils (out of 103,347 equalized pupils in the state) resided in towns or other areas in which
all students were “tuitioned out.” Another 87 equalized pupils resided in towns that did not operate an elementary school but belonged
to a union high school district.

¹² In the 2001-02 school year, 15,274 equalized pupils resided in towns in which elementary students were served locally but high school
students were tuitioned out. Of these 15,274 equalized pupils, less than half were tuitioned out.

¹³ If a town tuitions out either elementary or high school students, those students could be attending school in several surrounding districts.
As a result, the town cannot be matched to a single elementary or high school district.






focus on variation in expenditures across towns. So, 
even though cross-town variation provides an imper- 
fect indication of the variation across school districts 
(and, thus, across students) in expenditures, results 
are presented that use town-level data on expenditures. 
These results make it possible to compare the findings 
in this study to those in previous work. Further, the 
town-level data are such that it is possible to make a 
crude adjustment for the effect of institutional varia- 
tion on expenditures. In particular, the analysis in this 
paper will use two alternative measures of expendi- 
tures, one of which is explicitly designed to adjust for 
variation in institutional structure. 

Summary Statistics 

One of the advantages of examining the impact of fi- 
nance reforms in Vermont is the sta- 
bility of the student population 
served by Vermont schools. For ex- 
ample, in the 1995-96 school year 
3.12 percent of the students attend- 
ing public school in Vermont were 
identified as minority. This percent- 
age fluctuated slightly over the next 
4 academic years, from 2.73 percent 
in 1996-97 to 3.16 percent in 1999- 
2000. Clearly, in Vermont, unlike 
in California, the schools were not 
trying to adjust to a dramatically 
changing population at the same time 
they were coping with the effects of 
finance reforms. 

Other measures of the income and demographics of 
the Vermont student population were also relatively 
stable both immediately before and just after the implementation
of Act 60. For each school year from 1994-95
to 2000-01, table 1 provides summary statistics
on certain key measures of the demographics and in- 
come of each school district in the state. Average ad- 
justed gross income per exemption, a rough proxy for 
per capita income, did increase throughout the pe- 
riod, an unsurprising result given that all dollar fig- 
ures in the table are nominal and that this was a period 
of strong economic expansion. What is more striking, 
however, is the stability across time of the poverty rate 



and the percent of students eligible for free or reduced- 
price school lunches. The observable characteristics of 
the population of students being served by Vermont 
schools appear to have changed little over time. 

This stability of measured attributes of the student 
population does not ensure that there have been no
significant changes in critical unmeasured characteris- 
tics of the students served by the public schools in 
Vermont. In other words, in an event analysis of the 
impact of Act 60 on the distribution of student per- 
formance, there will be no way to rule out the possi- 
bility that cross-time changes in the distribution are 
driven by cross-time changes in unobservables rather
than by the effects of the finance reforms. That
said, the Vermont context still provides researchers 
with the best opportunity to date to estimate the ef- 
fects of finance reforms on the distri- 
bution of student performance. 

The remaining rows in table 1 pro- 
vide summary information on some 
of the expenditure and student per- 
formance measures available in the 
Vermont School Report. No obvious 
trends in performance are apparent 
in table 1. Some performance mea- 
sures improved after Act 60; others 
declined. For some of the measures 
of performance, dispersion fell after 
Act 60, but dispersion increased for 
other measures. However, because the
crude summary measures in table 1 
give no indication of how post-Act 60 changes are 
linked to a district’s pre-Act 60 status, no conclusions 
about the performance effects of Act 60 can be drawn 
on the basis of the evidence in table 1. 

Table 1 also does not support any firm conclusions 
about the extent to which the link between local wealth 
and spending has been weakened by Act 60. That said, 
even the summary measures in table 1 provide some 
indication of the impact of Act 60 on the dispersion 
in expenditures. The coefficient of variation of current 
expenditures per pupil increased from 3.61 in 1994- 
95 to 5.61 in 2001-02. While some of this increase 
predated Act 60, Act 60 has mattered, a fact that 






¹⁴ Means across schools of the percent minority evidence the same stability. In 1995-96, the across-school mean was 2.1 percent. In 1999-2000,
the across-school mean was 2.8 percent.

Table 1. Summary statistics: selected characteristics of Vermont school districts

Pre-Act 60

             1994-95             1995-96             1996-97             1997-98
      Mean        SD      Mean        SD      Mean        SD      Mean        SD

Current expenditures per pupil (in dollars)
   5572.94   1523.49   5773.13   1512.51   5935.14   1012.70   6175.11   1105.13
Special education costs per eligible pupil (in dollars)
   1208.36    635.58   1302.94    719.15   1363.33    654.22   1503.53    710.10
Students per teacher
     17.96     11.24     25.52     60.16     17.28      3.28     16.86      3.04
Average teacher salary (in dollars)
  32464.85   4237.02  33530.49   4487.78  33948.62   4495.41
Students per computer
         —         —         —         —      8.03      4.67      8.04      4.65
Percent of students eligible for lunch subsidy
                                             27.66     17.90     29.20     17.99
Poverty rate
     12.29      7.74     12.44      7.86     10.91      8.04     11.59      7.40
Average adjusted gross income per exemption¹ (in dollars)
  13580.56   2621.25  14220.69   2790.16  14894.31   3026.38  15829.26   3194.75
Percent of grade 2 students at or above the standard on the NSRE² in reading
                                                                 75.70     13.62
Percent of grade 4 students at or above the standard on the NSRE in mathematics concepts
                         17.97     16.50                         32.47     19.10
Percent of grade 4 students at or above the standard on the NSRE in reading
                                             58.17     17.89     79.19     13.22
Percent of grade 8 students at or above the standard on the NSRE in mathematics concepts
                         30.00     14.53                         37.98     18.16
Percent of grade 8 students at or above the standard on the NSRE in reading
                                             73.03     13.15     61.38     16.70

Post-Act 60

             1998-99           1999-2000             2000-01             2001-02
      Mean        SD      Mean        SD      Mean        SD      Mean        SD

Current expenditures per pupil (in dollars)
   6652.33   1294.00   7601.18   1330.29   8262.36   1491.91   8888.53   1583.64
Special education costs per eligible pupil (in dollars)
   1693.59    839.70
Students per teacher
     16.57      3.56     15.80      2.92     15.18      2.88     15.03      3.41
Average teacher salary (in dollars)
  34898.05   4889.94  35455.79   4881.99  36130.65   4861.57  37229.33   5192.18
Students per computer
      7.35      4.84      6.81      4.46         —         —         —         —
Percent of students eligible for lunch subsidy
     29.42     18.12     29.07     17.44     28.35     16.89     29.01     17.25
Poverty rate
     10.74      7.50     11.81      7.70     10.63      7.05      9.73      6.55
Average adjusted gross income per exemption¹ (in dollars)
  16913.95   3401.20  17736.01   3500.82  18597.44   3781.75  19605.62   5359.29
Percent of grade 2 students at or above the standard on the NSRE² in reading
     71.89     15.34     75.63     14.60     78.01     14.25     80.44     12.20
Percent of grade 4 students at or above the standard on the NSRE in mathematics concepts
     37.94     17.88     38.14     18.84     42.06     19.04     45.65     20.42
Percent of grade 4 students at or above the standard on the NSRE in reading
     86.12     11.00     83.04     12.70     79.30     13.26     80.29     12.08
Percent of grade 8 students at or above the standard on the NSRE in mathematics concepts
     31.75     16.56     32.08     15.65     35.65     18.76     36.03     17.80
Percent of grade 8 students at or above the standard on the NSRE in reading
     63.02     13.82     58.62     15.27     63.25     14.32     64.89     13.76

— Not available.

NOTE: SD is the standard deviation.

¹ From 1997-98 on, average adjusted gross income per exemption is available for all school districts (n = 248). In the remaining years, average adjusted gross income per
exemption is only available for those districts that correspond directly to towns (n = 203).

² NSRE is the New Standards Reference Exam.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files.

should be more evident when post-phase-in expenditure
measures are analyzed. It is to these measures that
we turn in the next section. 
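
As a check on the dispersion figures, the coefficient of variation can be computed directly from the means and standard deviations of current expenditures per pupil in table 1. The sketch below uses the conventional definition, the standard deviation divided by the mean. Whether the figures quoted in the text rest on this definition is not stated, so the inverse ratio (mean divided by standard deviation) is printed as well for comparison; the variable names are illustrative.

```python
# (mean, standard deviation) of current expenditures per pupil, from table 1.
current_expenditures = {
    "1994-95": (5572.94, 1523.49),
    "2001-02": (8888.53, 1583.64),
}


def coefficient_of_variation(mean, sd):
    """Conventional definition: standard deviation divided by the mean."""
    return sd / mean


cv = {}
inverse = {}
for year, (mean, sd) in current_expenditures.items():
    cv[year] = coefficient_of_variation(mean, sd)
    inverse[year] = mean / sd   # shown only for comparison with the text
    print(year, round(cv[year], 3), round(inverse[year], 2))
```

A lower value of the conventional coefficient indicates less dispersion relative to the mean, so the two ratios move in opposite directions and should not be compared with one another.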

Results 

The Distribution of Expenditures Before and 
After Act 60 

The starting point of any evaluation of the effects of 
Act 60 is the choice of a measure of per pupil expendi- 
tures. When towns have been used as the unit of analy- 
sis, two measures of spending have been used in the 
analyses of the extent of spending inequality and the 
effect of Act 60 on that inequality. Heaps and Woolf 
(2000) used budgeted expenditures per equalized pu- 
pil. However, because many towns 
send or receive students for whom tu- 
ition is being paid, inequality in bud- 
geted expenditures may overstate true 
spending inequality. For example, 
overstatement of inequality could re- 
sult because budgeted expenditures 
per equalized pupil are based on resi- 
dential pupil counts that do not in- 
clude tuitioning students, resulting 
in artificially high per pupil numbers 
for districts receiving tuition students. 
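
The overstatement can be seen in a small worked example. The sketch assumes a single receiving district, ignores the equalized-pupil weights, and uses hypothetical figures; the function name is illustrative.

```python
def per_pupil_spending(budget, resident_pupils, tuition_pupils_received=0):
    """Return (spending per resident pupil, spending per pupil served)."""
    per_resident = budget / resident_pupils
    per_served = budget / (resident_pupils + tuition_pupils_received)
    return per_resident, per_served


# Hypothetical receiving district: a $1.4 million budget, 150 resident
# pupils, and 50 pupils whose tuition is paid by sending towns.
per_resident, per_served = per_pupil_spending(1_400_000, 150, 50)
overstatement = per_resident - per_served
print(round(per_resident, 2), per_served, round(overstatement, 2))
```

Dividing the budget by resident pupils alone yields a per pupil figure well above the amount actually available for each pupil served, which is the source of the artificially high numbers for receiving districts.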

So, as an alternative, other analysts, 
like Jimerson (2001) and Baker 
(2001), use measures of spending 
based on the state’s calculation of lo- 
cal education spending per equalized 
pupil. Local education spending is that portion of a 
school district budget paid by the general state sup- 
port grant, local education tax revenues, and any aid 
from the sharing pool when applicable. Local educa- 
tion spending does not include federal aid or privately 
donated dollars. 

In what follows, both measures of spending are con- 
sidered since neither is a perfect indicator of the edu- 
cational opportunities available to students in a town. 
The argument for using budgeted expenditures is that 
this measure includes expenditures out of not only 
noncategorical state aid and property tax income but 



also expenditures out of such diverse income sources 
as categorical aid for special education and income 
from the private donations to the schools. But because 
of the problems created by students for whom tuition 
payments are being made, local education spending 
per pupil must also be considered. 

At the school district level, the choice of expenditure 
measures is somewhat more clear-cut. Current expen- 
ditures per pupil measures noncapital spending; total 
expenditures per pupil includes current and capital 
spending. In the analysis that follows, both of these 
measures are examined. It is not possible, however, to 
examine the extent to which cost-adjusted spending 
has become more equal. Both before and after Act 60, 
the state aid formula recognized the fact that certain 
students are more costly to educate, basing aid 
amounts not on raw pupil counts but 
on equalized pupil counts. The equal- 
ized pupil count was determined by 
assigning weights of 1.25 to second- 
ary students and students receiving 
food stamps, assigning weights of 1.2 
to students with limited English pro- 
ficiency, and averaging the weighted 
counts across 2 school years (Mathis 
2001). Since these weights are ad hoc 
and other critical determinants of cost 
are not taken into account, the cost 
adjustments in the basic state aid for- 
mula will be imperfect (Downes and 
Pogue 1994). Categorical aid pro- 
grams, like a small schools grant pro- 
gram that was established by Act 60, may help to 
reduce inequality in cost-adjusted aid. Nevertheless, 
any inequality measures presented below undoubtedly 
understate the extent of inequality in cost-adjusted 
spending, since high-cost districts are typically low- 
spending districts. 
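
The equalized pupil count described above can be sketched in a few lines of Python. This is only an illustration, not the state's actual procedure: the weights (1.25, 1.25, and 1.2) and the two-year averaging come from the text, but treating each weight as an add-on to a base count of 1.0 per pupil, the function names, and all enrollment and budget figures are assumptions made for the example.

```python
# Weights quoted in the text (Mathis 2001).
SECONDARY_WEIGHT = 1.25
FOOD_STAMP_WEIGHT = 1.25
LIMITED_ENGLISH_WEIGHT = 1.20


def weighted_count(total, secondary, food_stamps, limited_english):
    """Weighted pupil count for one school year.

    ASSUMPTION: every pupil counts once, and each weight adds its excess
    over 1.0 for pupils in that category. The source does not say how
    overlapping categories are combined.
    """
    return (
        total
        + (SECONDARY_WEIGHT - 1.0) * secondary
        + (FOOD_STAMP_WEIGHT - 1.0) * food_stamps
        + (LIMITED_ENGLISH_WEIGHT - 1.0) * limited_english
    )


def equalized_pupils(year_1, year_2):
    """Average the weighted counts across two school years."""
    return (weighted_count(**year_1) + weighted_count(**year_2)) / 2.0


# Hypothetical enrollment figures for one district.
prior_year = {"total": 100, "secondary": 40, "food_stamps": 20, "limited_english": 10}
current_year = {"total": 110, "secondary": 40, "food_stamps": 24, "limited_english": 10}

equalized = equalized_pupils(prior_year, current_year)
spending_per_equalized_pupil = 900_000 / equalized  # hypothetical budget
```

Because the equalized count exceeds the raw count whenever a district enrolls weighted pupils, spending per equalized pupil is lower than spending per raw pupil for the same budget.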

While the circumstances cited by the plaintiffs in 
Brigham v. State existed for many years, trends in 
spending inequality in the late 1980s and early 1990s 
undoubtedly contributed to the decision to file suit. 
For example, from 1989-90 to 1994-95, current ex- 
penditures per pupil had grown at an annual rate of 



Since categorical aid is fungible, increases in categorical aid do increase the opportunities even for those students toward whom the aid 
is not targeted. 







School Finance Reform and School Quality: Lessons From Vermont 



'iFJl percent at the top of the range. The annual growth 
rate at the bottom of the range was only 1.9 percent. 
The Brigham decision was handed down when the dis- 
persion in expenditures was large and growing. 

Standard inequality measures like the coefficient of 
variation and the Gini coefficient can both reflect the 
tail end of these trends and can provide an initial
indication of the impact of Act 60 on spending inequality.
And when the town level measures of spending 
are used, the initial indication is that spending has 
become more equal post-Act 60. In particular, for bud- 
geted expenditures in 1998-99, the coefficient of varia- 
tion was 0.1317 and the Gini coefficient was 0.0728. 
In 2000-01, the coefficient of variation was 0.1158 
and the Gini coefficient was 0.0652. These measures 
of inequality increased after 2000-01; in 2002-03 
they equaled 0.1249 for the coeffi- 
cient of variation and 0.0699 for the 
Gini coefficient. But both measures 
were still below their 1998-99 lev- 
els. And since Act 60 was already be- 
ing phased-in in 1998-99, these 
numbers probably understate the ex- 
tent to which inequality in education 
spending by towns has declined after 
the implementation of Act 60. 
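
For readers who want to reproduce measures of this kind, the following sketch computes an enrollment-weighted coefficient of variation and Gini coefficient. The formulas are the standard textbook definitions, which is an assumption about what was used here; the paper states only that expenditures were weighted by enrollment. The spending and enrollment figures are hypothetical.

```python
import math


def weighted_mean(values, weights):
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def coefficient_of_variation(values, weights):
    """Enrollment-weighted standard deviation divided by the weighted mean."""
    mean = weighted_mean(values, weights)
    variance = weighted_mean([(v - mean) ** 2 for v in values], weights)
    return math.sqrt(variance) / mean


def gini_coefficient(values, weights):
    """Enrollment-weighted Gini: half the relative mean absolute difference."""
    mean = weighted_mean(values, weights)
    total_weight = sum(weights)
    abs_diff_sum = sum(
        wi * wj * abs(vi - vj)
        for vi, wi in zip(values, weights)
        for vj, wj in zip(values, weights)
    )
    return abs_diff_sum / (2.0 * total_weight ** 2 * mean)


# Hypothetical per pupil spending and enrollment for four towns.
spending = [6500.0, 7200.0, 8100.0, 9400.0]
enrollment = [300, 1200, 450, 150]

cv = coefficient_of_variation(spending, enrollment)
gini = gini_coefficient(spending, enrollment)
```

Both measures equal zero when every pupil receives the same spending and rise as spending becomes more dispersed, which is why a fall in either one is read as a decline in inequality.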

Inequality measures at the school dis- 
trict level tell much the same story 
as town level inequality measures. 

For example, for those school dis- 
tricts serving students in grades K- 
12, the coefficient of variation of current expenditures 
per pupil was 0.1280 in 1995-96. For these dis- 
tricts, the coefficient of variation increased to 0.1358 
in 1996-97 and 0.1380 in 1997-98, the last pre- 
Act 60 year. In the post-Act 60 period, the coeffi- 
cient of variation for current expenditures per pupil 



in these school districts generally declined, falling
to 0.1329 in 1998-99, 0.1252 in 1999-2000, and
0.1144 in 2000-01, before increasing to 0.1339
in 2002-03.

For other types of school districts, the inequality mea- 
sures tend to tell the same story: fluctuating inequal- 
ity pre-Act 60 and reduced inequality post-Act 60. 
The one exception occurs for those elementary school 
districts located in towns that belong to union or joint 
high school districts. For these districts, the coefficients
of variation in current expenditures per pupil
were 0.4083 in 1995-96, 0.1579 in 1996-97, 0.1676
in 1997-98, 0.1834 in 1998-99, 0.1938 in 1999-2000,
0.1974 in 2000-01, and 0.1956 in 2001-02.
The absence of any decline in inequality in spending 
in the post-Act 60 period may be attributable to the 
ability of some of these districts to 
circumvent Act 60. 

The stability in inequality in elemen- 
tary school districts located in towns 
that belong to union or joint high 
school districts does not, by itself, 
indicate that goals of Act 60 have not 
been accomplished, since dispersion 
of expenditures does not imply un- 
equal opportunities attributable to 
differences in taxable wealth, a real- 
ity that was recognized by the Vermont
Supreme Court. For instance, 
dispersion in current expenditures per 
pupil could exist and be unrelated to 
property wealth if the state targeted categorical aid to 
districts with a greater proportion of disadvantaged 
students. What equalization of educational opportu- 
nities does require is elimination of the positive corre- 
lation between expenditures and taxable wealth. That 
this is the case is made clear in the Brigham decision: 






Expenditures were weighted by enrollment in the calculation of the inequality measures. See Murray, Evans, and Schwab (1998) for 
further discussion of the need for weighing by enrollment. 

The pattern of Gini coefficients for current expenditures per pupil in these districts is very similar. The values of the Gini coefficients were 
0.0732 in 1995-96, 0.0777 in 1996-97, 0.0810 in 1997-98, 0.0778 in 1998-99, 0.0727 in 1999-2000, 0.0645 in 2000-01, and 0.0769 
in 2002-03. 

In addition to K-12 districts and elementary districts located in towns that belong to union or joint high school districts, the other large 
group of districts is elementary districts located in towns that tuition out their high school students. 

Again, the pattern of Gini coefficients for current expenditures per pupil in these districts is very similar. The values of the Gini coefficients 
were 0.1664 in 1995-96, 0.0889 in 1996-97, 0.0929 in 1997-98, 0.0983 in 1998-99, 0.1005 in 1999-2000, 0.1019 in 2000-01, and 
0.1042 in 2001-02. 
Equal educational opportunity cannot be 
achieved when property-rich school districts 
may tax low and property-poor districts must 
tax high to achieve even minimum standards. 
Children who live in property-poor districts 
and children who live in property-rich districts 
should be afforded a substantially equal op- 
portunity to have access to similar educational 
revenues. (Brigham v. State, 166 Vt. at 268) 

Simple inequality measures do not tell us the extent 
to which Act 60 has produced a system of school 
financing in which the correlation between spend- 
ing and wealth has been reduced. Thus, following 
the logic of Downes (1992), simple ordinary least 
squares regressions of the spending measures on mea- 
sures of local resources were used to determine the 
extent to which Act 60 has reduced 
this correlation. For towns, the re- 
sults of these regressions are pre- 
sented in tables 2A and 2B. 
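
The regressions described above can be illustrated with a short sketch. It fits a one-regressor ordinary least squares line and reports the correlation coefficient alongside R²; for a single regressor the two are linked by r² = R², which appears to be how the R² and correlation rows of the tables relate. The town data below are hypothetical, the function name is an illustrative choice, and the robust standard errors reported in the tables are not computed here.

```python
import math


def simple_ols(x, y):
    """Return intercept, slope, R-squared, and Pearson correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    correlation = sxy / math.sqrt(sxx * syy)
    r_squared = correlation ** 2  # holds for a one-regressor model
    return intercept, slope, r_squared, correlation


# Hypothetical towns: assessed valuation per pupil and spending per pupil.
valuation = [150_000.0, 220_000.0, 310_000.0, 480_000.0, 650_000.0]
spending = [7000.0, 7300.0, 7150.0, 7600.0, 7900.0]

intercept, slope, r_squared, correlation = simple_ols(valuation, spending)

# Consistency check on reported values: part 1 of table 2A lists an
# R-squared of 0.266 alongside a correlation coefficient of 0.515.
implied_correlation = math.sqrt(0.266)
```

A positive slope and a large correlation indicate that wealthier towns spend more per pupil; a slope and correlation near zero are what wealth-neutral financing would produce.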

For the 246 towns in Vermont for 
which the relevant data are available, 
the first part of table 2A indicates that 
the correlation between budgeted ex- 
penditures per equalized pupil and 
equalized assessed valuation per pupil
was .515, clear evidence that 
districts with more real property 
wealth did have higher per pupil ex- 
penditures prior to Act 60. Since Act 
60 was already being phased-in in 
1998-99, this correlation probably understates the ac- 
tual strength of the relationship between expenditures 
and property wealth prior to the Brigham decision. 

The remainder of the first column of table 2A shows 
that, while the extent of inequality in educational op- 
portunities varies across potential measures of taxable 
resources, the conclusion that opportunities were un- 
equal does not depend on the measure of taxable re- 



sources used. For example, if permanent income is 
taken as the measure of taxable resources and median 
family income is used to proxy for permanent income, 
the correlation between budgeted expenditures per 
equalized pupil and taxable resources is .295, much 
less than the correlation between budgeted expendi- 
tures and equalized assessed valuation but still strong. 

As the discussion of table 1 indicated, after Act 60 
dispersion in expenditures was reduced, even in the 
phase-in years. Nevertheless, dispersion remained. But 
the Brigham decision did not require equalization of 
expenditures; the decision required the ability to fund 
public education to be independent of (or negatively 
correlated with) taxable wealth. The second column 
of table 2A and both columns of table 2B provide the 
evidence needed to determine if Act 60 has resulted in 
an education financing system that 
satisfies the Brigham decision. From 
1998-99 to 2002-03, the correla- 
tion between equalized assessed valu- 
ation per pupil and budgeted 
expenditures per equalized pupil fell 
from .515 to .077 and, in the latter 
year, was insignificant at the 10 per- 
cent level. Similar weakening in the 
relationship between this expenditure 
measure and other measures of tax- 
able resources can be seen in table 2A. 
Further, in table 2B, which gives only 
post-Act 60 correlations between tax- 
able resource measures and local ex- 
penditures per equalized pupil, the 
estimated relationship between equalized assessed valu- 
ation per pupil and local expenditures per equalized 
pupil is actually negative. Median family income con- 
tinues to be positively related to local expenditures 
per equalized pupil, though this relationship does ap- 
pear to be weakening over time. 
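
The significance statement above can be checked roughly from the coefficients and robust standard errors reported in part 1 of table 2A. The sketch below uses a two-sided test with a normal approximation; that choice is an assumption, since the paper does not state which reference distribution was used.

```python
from statistics import NormalDist


def two_sided_p_value(coefficient, standard_error):
    """Two-sided p-value for H0: coefficient = 0, normal approximation."""
    z = abs(coefficient / standard_error)
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Slope on equalized assessed valuation per pupil, part 1 of table 2A
# (coefficient, robust standard error).
p_1998_99 = two_sided_p_value(0.00093, 0.00027)
p_2002_03 = two_sided_p_value(0.00011, 0.00012)

significant_1998_99 = p_1998_99 < 0.10
significant_2002_03 = p_2002_03 < 0.10
```

Under this approximation the 1998-99 slope is significant at the 10 percent level and the 2002-03 slope is not, in line with the text.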

In combination with the evidence on the simple dis- 
tributions of expenditures, these results support the 






Because of data limitations, equalized assessed valuation can only be calculated for 1998-99. The 1998-99 values are used throughout 
this analysis. While pre-Act 60 measures of property wealth would probably be preferable, Act 60-induced changes in property values 
were unlikely to be apparent in 1998-99, the first year of the phase-in of Act 60. 

It is not possible to separate capital expenditures out of this measure of per pupil expenditures. No clear indication exists as to whether 
the correlation of this expenditure measure with assessed valuation overstates or understates the correlation of current expenditures with 
assessed valuation. Thus, some caution must be exercised in interpreting these correlations. 






Table 2A. Relationships between expenditures and wealth measures for Vermont towns: 1998-2003

[Dependent variable: Budgeted expenditures per equalized pupil]

Variable                                    1998-99                2002-03

Part 1
Intercept                                   7099.289 (125.227)     9587.354 (103.488)
Equalized assessed valuation per pupil      0.00093 (0.00027)      0.00011 (0.00012)
R²                                          0.266                  0.006
Correlation coefficient                     0.515                  0.077

Part 2
Intercept                                   5932.714 (357.866)     8373.432 (396.410)
Median family income in 1989                0.04901 (0.01110)      0.03906 (0.01147)
R²                                          0.0873                 0.0413
Correlation coefficient                     0.295                  0.203

Part 3
Intercept                                   5786.738 (298.144)     8360.895 (398.805)
Equalized assessed valuation per pupil      0.00088 (0.00024)      0.00007 (0.00008)
Median family income in 1989                0.04086 (0.00910)      0.03821 (0.01149)
R²                                          0.325                  0.0439
Correlation coefficient                     0.570                  0.210

NOTE: Robust standard errors in parentheses.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files; Heaps
and Woolf (2000); Vermont Indicators Online database.



Table 2B. Relationships between expenditures and wealth measures for Vermont towns: 2000-03

[Dependent variable: Local expenditures per equalized pupil]

Variable                                    2000-01                2002-03

Part 1
Intercept                                   6980.417 (75.355)      7918.069 (85.572)
Equalized assessed valuation per pupil      -0.00036 (0.00011)     -0.00028 (0.00010)
R²                                          0.0613                 0.0403
Correlation coefficient                     0.248                  0.201

Part 2
Intercept                                   5419.094 (303.555)     6439.233 (400.806)
Median family income in 1989                0.04200 (0.00941)      0.03927 (0.01226)
R²                                          0.0862                 0.0509
Correlation coefficient                     0.294                  0.226

Part 3
Intercept                                   5477.803 (278.514)     6501.401 (376.359)
Equalized assessed valuation per pupil      -0.00042 (0.00015)     -0.00033 (0.00013)
Median family income in 1989                0.04658 (0.00848)      0.04345 (0.01139)
R²                                          0.172                  0.127
Correlation coefficient                     0.415                  0.356

NOTE: Robust standard errors in parentheses.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files; Heaps
and Woolf (2000); Vermont Indicators Online database.
view that a good faith effort has been made to satisfy
the Brigham decision. While the correlation between
taxable resources and the two expenditure measures
considered here has not been reduced to zero, educational
opportunities were more equal in 2002-03 than
in 1998-99.

When we turn to school districts as the unit of analysis,
the results do not provide quite as unequivocal a
picture of the impact of Act 60 on the correlation between
wealth measures and per pupil spending. In tables 3A
and 3B, the results of regressions like those that
generated the results in tables 2A and 2B are reported for
the case when K-12 districts are the unit of analysis.
In tables 4A and 4B, elementary school districts
located in towns that belong to union or joint high school
districts are the unit of analysis.

When current expenditures per pupil is used as the
spending measure, the correlations between spending
and wealth decline for each wealth measure both for
K-12 districts and for elementary school districts
located in towns that belong to union or joint high
school districts.

If, however, total expenditures per pupil is used as
the spending measure, there is not consistent evidence
of a weakening in the relationship between spending
and wealth. For each of the wealth measures, there is
evidence of a decline in the correlation between total
expenditures per pupil and wealth, when K-12 districts
are the unit of analysis. However, for each of the three
wealth measures, the correlation between total
expenditures per pupil and wealth has increased for
elementary school districts located in towns that belong
to union or joint high school districts. These results
confirm the picture created by the simple inequality
measures; Act 60 has had less of an impact on inequality
in educational opportunities in elementary school
districts located in towns that belong to union or joint
high school districts.

Student Performance Before and After Act 60

As the discussion in “School Finance Reform in Vermont,”
above, indicates, the Brigham decision focused
on spending inequities. Further, the goals of Act 60
were to reduce spending inequities and to provide
property tax relief. Nevertheless, the justices of the
Vermont Supreme Court made clear in their decision that
in their view inequities in expenditures were likely to
translate into inequities in outcome:

While we recognize that equal dollar resources do
not necessarily translate equally in effect, there is
no reasonable doubt that substantial funding
differences significantly affect opportunities to learn.
To be sure, some school districts may manage their
money better than others, and circumstances
extraneous to the educational system may substantially
affect a child’s performance. Money is clearly not
the only variable affecting educational opportunity,
but it is one that government can effectively
equalize. (Brigham v. State, 166 Vt. at 255-56)

Thus, consideration must be given to how the distribution
across districts of student performance changed
after Act 60.



Given the available data, it was not possible to quantify directly the strength of the correlation between expenditures and wealth measures 
prior to implementation of Act 60. However, the results of Baker (2001) provide an indirect indication of the strength of the correlation. 
In regressions that are analogous to those in part 3 of table 2B, Baker generates R^s ranging from 0.47 to 0.51 for the school years from 
1994-95 to 1998-99. Further, the highest occurs in 1998-99, the first year of the Act 60 phase-in. The implication, then, is that the 
correlation between expenditures and the wealth measures considered in this paper was probably strong and stable in the years leading 
up to Act 60. 

All of these regressions have been estimated in log-log form with contemporaneous measures of per pupil equalized assessed value and 
with adjusted gross income per exemption replacing the lagged measures used in tables 3A, 3B, 4A, and 4B. The implications of the results 
that are generated from these alternative specifications are the same as those reported here. 
Table 3A. Relationships between expenditures and wealth measures for Vermont K-12 districts:
1997-2002

[Dependent variable: Current expenditures per equalized pupil]

Variable                                          1997-98                2001-02

Part 1
Intercept                                         5593.263 (232.264)     8373.759 (421.578)
Equalized assessed valuation per pupil in 1999    0.00190 (0.00059)      0.00179 (0.00133)
R²                                                0.182                  0.087
Correlation coefficient                           0.426                  0.295

Part 2
Intercept                                         4539.397 (779.732)     7606.276 (840.734)
Adjusted gross income per exemption in 1995       0.11919 (0.05448)      0.09352 (0.05480)
R²                                                0.155                  0.050
Correlation coefficient                           0.394                  0.223

Part 3
Intercept                                         5047.538 (940.288)     8219.601 (1124.209)
Equalized assessed valuation per pupil in 1999    0.00131 (0.00085)      0.00157 (0.00197)
Adjusted gross income per exemption in 1995       0.05392 (0.07972)      0.01490 (0.10701)
R²                                                0.209                  0.091
Correlation coefficient                           0.457                  0.302

NOTE: Robust standard errors in parentheses.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files.



Table 3B. Relationships between expenditures and wealth measures for Vermont K-12 districts:
1997-2002

[Dependent variable: Total expenditures per equalized pupil]

Variable                                          1997-98                2001-02

Part 1
Intercept                                         6917.383 (483.383)     9673.285 (608.913)
Equalized assessed valuation per pupil in 1999    0.00122 (0.00087)      0.00128 (0.00169)
R²                                                0.021                  0.021
Correlation coefficient                           0.145                  0.145

Part 2
Intercept                                         5464.105 (1577.796)    8967.333 (906.447)
Adjusted gross income per exemption in 1995       0.12711 (0.10495)      0.07108 (0.05652)
R²                                                0.048                  0.012
Correlation coefficient                           0.219                  0.112

Part 3
Intercept                                         5448.176 (1867.612)    9457.480 (1263.456)
Equalized assessed valuation per pupil in 1999    -0.00004 (0.00132)     0.00125 (0.00270)
Adjusted gross income per exemption in 1995       0.12915 (0.14738)      0.08242 (0.13496)
R²                                                0.048                  0.024
Correlation coefficient                           0.218                  0.155

NOTE: Robust standard errors in parentheses.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files.
Table 4A. Relationships between expenditures and wealth measures for Vermont elementary
districts located in towns that do not tuition out high school students: 1997-2002

[Dependent variable: Current expenditures per equalized pupil]

Variable                                          1997-98                2001-02

Part 1
Intercept                                         5512.449 (170.779)     7986.672 (237.922)
Equalized assessed valuation per pupil in 1999    0.00092 (0.00023)      0.00134 (0.00028)
R²                                                0.170                  0.157
Correlation coefficient                           0.412                  0.396

Part 2
Intercept                                         5116.117 (505.333)     8018.435 (829.521)
Adjusted gross income per exemption in 1995       0.07212 (0.03679)      0.06213 (0.05965)
R²                                                0.032                  0.010
Correlation coefficient                           0.179                  0.099

Part 3
Intercept                                         5199.331 (481.227)     8152.715 (756.184)
Equalized assessed valuation per pupil in 1999    0.00083 (0.00023)      0.00134 (0.00030)
Adjusted gross income per exemption in 1995       0.02712 (0.03741)      -0.01048 (0.05583)
R²                                                0.162                  0.151
Correlation coefficient                           0.402                  0.388

NOTE: Robust standard errors in parentheses.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files.



Table 4B. Relationships between expenditures and wealth measures for Vermont elementary
districts located in towns that do not tuition out high school students: 1997-2002

[Dependent variable: Total expenditures per equalized pupil]

Variable                                          1997-98                2001-02

Part 1
Intercept                                         6467.415 (230.387)     8859.059 (273.432)
Equalized assessed valuation per pupil in 1999    0.00091 (0.00022)      0.00147 (0.00028)
R²                                                0.065                  0.118
Correlation coefficient                           0.254                  0.344

Part 2
Intercept                                         5635.405 (764.984)     8026.746 (1025.426)
Adjusted gross income per exemption in 1995       0.10323 (0.05418)      0.13226 (0.07569)
R²                                                0.024                  0.028
Correlation coefficient                           0.155                  0.169

Part 3
Intercept                                         5717.045 (737.757)     8164.198 (946.995)
Equalized assessed valuation per pupil in 1999    0.00081 (0.00024)      0.00137 (0.00031)
Adjusted gross income per exemption in 1995       0.05908 (0.05555)      0.05793 (0.07331)
R²                                                0.071                  0.122
Correlation coefficient                           0.266                  0.349

NOTE: Robust standard errors in parentheses.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files.
A crude indication of the impact of Act 60 on student
performance is given by table 5, which presents
correlations in 1995-96 and 2001-02 between some of
the district characteristics summarized in table 1. The
correlations between student performance and all
available measures of the resources allocated towards
education have weakened in the post-Act 60 period. The
starkest example of the weakening of these relationships
is the decline from 1995-96 to 2001-02 of the
relationship between current expenditures per pupil
and the percent of eighth-graders at or above the
standard for the concepts portion of the New Standards
Reference Exam (NSRE) in mathematics. A more
systematic assessment of the impact of Act 60 can be
based on the results in table 6, which gives a few
typical event-analysis regressions that are similar in flavor
to those in Downes and Figlio (2000).

Because they include controls for 
district-specific effects and because 
they are based on a functional form 
that explicitly accounts for the real- 
ity that the share of students meet- 
ing the standard must range between 
0 and 1, regressions like those in table 
6 provide the most convincing esti- 
mates of the impact of Act 60. Fur- 
ther, because in these regressions the 
impact of Act 60 is allowed to vary 
with pre-Act 60 spending levels or 
pre-Act 60 property wealth, the re- 
gressions provide a direct indication 
of the extent to which the link be- 
tween wealth and performance has changed post-Act 
60. And, what is apparent from these regressions, and 
from a number of regressions in which other outcome 
measures are used as the dependent variable, is that 
there is some evidence that the gaps in performance 
between high-spending and low-spending districts and 
between high-wealth and low-wealth districts have, 
ceteris paribus, declined post-Act 60. In these regres- 
sions, the coefficient on the interaction between the 
Act 60 dummy and the pre-Act 60 spending or pre- 



Act 60 wealth is never positive and significant. And, 
as can be seen in table 6, these coefficients are fre- 
quently negative and significant. 

Care must be taken, however, not to make too much 
of the declines in inequality. The coefficients on the 
interactions are not consistently negative and signifi- 
cant. Further, when these coefficients are significant, 
they are quantitatively small. For example, the coeffi- 
cient in the first column of table 6 implies that, ceteris 
paribus, the difference between the shares of fourth- 
graders at or above the standard for a school district 
with spending one standard deviation below the mean 
in 1994-95 and a school district with spending one 
standard deviation above the mean in 1994-95 would 
decline by 0.0021 if each district had the mean num- 
ber of test takers in 2000-01. It seems unlikely that 
such small declines in dispersion in 
performance justify a major policy in- 
tervention like Act 60. 

Concluding Remarks 

Act 60 represents a dramatic change 
in the system of education financing 
in a state with a history of a demo- 
graphically stable student population. 
As a result, Act 60 may well provide 
an unparalleled opportunity to assess 
the impact of a significant finance 
reform on a state’s education system. 
This paper represents a first cut at 
just such an assessment. 

All of the evidence cited in this paper supports the 
conclusion that Act 60 has dramatically reduced dis- 
persion in education spending and has done this by 
weakening the link between spending and property 
wealth. Further, the regressions presented in this pa- 
per offer some evidence that student performance has 
become more equal in the post-Act 60 period. And 
no results support the conclusion that Act 60 has con- 
tributed to increased dispersion in performance. 






Jimerson (2001) observes a similar decline in the correlation between equalized assessed value per pupil and the percent of fourth-graders 
at or above the standard for the NSRE. 

A traditional event-analysis approach is preferable to the production function approach used by Flanagan and Murray (2002) because the 
production function approach can only provide accurate estimates of the impact of the policy changes if all of the effects of the changes 
work through changes in measured inputs and if all of the changes in inputs are attributable to the policy changes. 
Table 5. Correlations between selected characteristics for Vermont school districts, 1995-96 and 2001-02

Column key:
(1) Current expenditures per pupil
(2) Students per classroom teacher
(3) Average teacher salary
(4) Poverty rate
(5) Average adjusted gross income per exemption (from tax returns)
(6) Percent of adults with college degree (1990)
(7) Percent of grade 4 students at or above the standard on the NSRE¹ in mathematics concepts
(8) Percent of grade 8 students at or above the standard on the NSRE¹ in mathematics concepts

Variable                                   (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)

1995-96
(1) Current expenditures per pupil         1.0000
(2) Students per classroom teacher        -0.0511    1.0000
(3) Average teacher salary                 0.2656   -0.2085    1.0000
(4) Poverty rate                          -0.1021   -0.0102   -0.1504    1.0000
(5) Average adjusted gross income          0.2286   -0.0522    0.5196   -0.5655    1.0000
(6) Percent of adults with college degree  0.1682   -0.0081    0.3916   -0.5683    0.7964    1.0000
(7) Grade 4 at or above standard           0.0827    0.0244    0.0731   -0.1993    0.1966    0.2135    1.0000
(8) Grade 8 at or above standard           0.2115   -0.0461    0.2283   -0.3585    0.3195    0.3812    0.1876    1.0000

2001-02
(1) Current expenditures per pupil         1.0000
(2) Students per classroom teacher        -0.3108    1.0000
(3) Average teacher salary                 0.0508    0.3572    1.0000
(4) Poverty rate                          -0.0541   -0.0955   -0.2405    1.0000
(5) Average adjusted gross income          0.1182    0.2716    0.5416   -0.5326    1.0000
(6) Percent of adults with college degree  0.2359    0.1540    0.4235   -0.5772    0.7901    1.0000
(7) Grade 4 at or above standard           0.0931    0.1022    0.1623   -0.3493    0.3077    0.3884    1.0000
(8) Grade 8 at or above standard           0.1885    0.1614    0.3146   -0.3562    0.4936    0.5084    0.3378    1.0000

¹ NSRE is the New Standards Reference Exam.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files.
Table 6. Generalized Linear Model estimates of impact of Act 60 on student performance: fixed
effects estimates,¹ 1995-2002

[Dependent variable: Number of test-takers at or above
the standard in mathematics concepts on the NSRE]

                                                                Fourth-Graders                                        Eighth-Graders
Variable                                                        Specification 1       Specification 2                 Specification 1       Specification 2

Dummy variable indicating post-Act 60                           -12.6896 (0.9788)     0.3965 (0.3103)                 -12.3158 (1.1338)     -1.7567 (0.3484)
Interaction of post-Act 60 dummy with per pupil
  equalized assessed valuation, 1998                            —                     -0.00000001 (0.00000012)        —                     0.00000013 (0.00000008)
Interaction of post-Act 60 dummy and current
  expenditures per pupil, 1995                                  -0.00003 (0.00001)    —                               -0.00002 (0.00002)    —
Poverty rate                                                    0.0046 (0.0143)       0.0059 (0.0145)                 0.0029 (0.0163)       0.0065 (0.0199)
Adjusted gross income per exemption                             -0.00001 (0.00001)    -0.00001 (0.00001)              0.00003 (0.00002)     0.00003 (0.00002)
Dummy variable indicating 1995-96 school year                   -14.0105 (1.2344)     -0.7150 (0.2912)                -12.3880 (1.1230)     -1.7779 (0.3189)
Dummy variable indicating 1997-98 school year                   -13.1792 (1.0415)     0.0874 (0.3076)                 -12.1258 (1.1304)     -1.5372 (0.3351)
Dummy variable indicating 1999-2000 school year                 -0.0108 (0.0463)      -0.0182 (0.0463)                -0.0078 (0.0547)      -0.0018 (0.0543)
Dummy variable indicating 2000-01 school year                   0.1823 (0.0563)       0.1827 (0.0554)                 0.1672 (0.0755)       0.1719 (0.0748)
Dummy variable indicating 2001-02 school year                   0.3295 (0.0840)       0.3285 (0.0830)                 0.2321 (0.1002)       0.2364 (0.0987)
Log of likelihood function                                      -3109.6629            -3159.0275                      1923.2455             -1955.8046
Number of observations                                          1182                  1190                            690                   701

— Not available.

¹ The constant is omitted from each specification. The omitted school year is 1998-99.

NOTE: Asymptotic standard errors robust to heteroskedasticity and within group correlation in parentheses. NSRE is the New
Standards Reference Exam.

SOURCE: Vermont School Report; publications of the Vermont Department of Taxes; Vermont Department of Education files.



By themselves, these results may provide useful information
for policymakers contemplating Act 60-style reforms.
But the value of these results may well increase dramati- 
cally when taken together with the results of Duncombe 
and Johnston (2002) and of Flanagan and Murray (2002). 
What is striking is the similarity across studies in the 
estimated achievement effects. Pre-finance reform data 
on student test scores are not available to Duncombe 
and Johnston; they find no evidence that a diminish- 
ment in the dispersion in performance is apparent when 
examining post-finance-reform test scores. They also
document some recent relative improvement in dropout
rates in high poverty districts, though they also find increased
dispersion in dropout rates when comparing pre-
and post-finance-reform data.

The bottom line of Duncombe and Johnston’s analysis 
of dropout rates is that reform has resulted in small 
relative improvements. Flanagan and Murray reach con- 
clusions similar to those reached in this paper — post- 
reform dispersion in schooling outcomes has declined, 
but this decline in dispersion has been small. The re- 
sults presented above indicate that, in Vermont, there 
have been, at most, small relative improvements in the 
test performance of fourth- and eighth-graders in those
school districts with lower pre-reform per pupil spend- 
ing and per pupil property wealth. Flanagan and Murray 
find that relative increases in post-reform spending were 
translated into relative gains in post-reform test performance,
but these gains were quantitatively small. Somewhat
surprisingly, then, the results of these new case
studies tend to echo the results of the earlier work on
California. Thus far, the case studies have confirmed a 
conclusion that was reached by many of the researchers 
who executed national-level analyses: the types of fi- 
nance reforms that have been implemented in response 
to court orders appear to have little, if any, impact on 
the distribution of student test performance. 









References 

Baker, B.D. (2001). Balancing Equity for Students and Taxpayers: Evaluating School Finance Reform in 
Vermont. Journal of Education Finance, 26(Spring): 239-248.

Benabou, R. (1996). Equity and Efficiency in Human Capital Investment: The Local Connection. Review of 
Economic Studies, 63(April): 237-264.

Brigham v. State, 166 Vt. 246, available at http://dol.state.vt.us/gopher_root1/supct/166/96-502.op

Burkett, E. (1998, April 26). Don’t Tread on My Tax Rate. The New York Times Magazine, 42-45. 

Card, D., and Payne, A.A. (2002). School Finance Reform, the Distribution of School Spending, and the 
Distribution of Student Test Scores. Journal of Public Economics, 83(January): 49-82.

Courant, P.N., and Loeb, S. (1997). Centralization of School Finance in Michigan. Journal of Policy Analysis 
and Management, 16(Winter): 114-136.

Downes, T.A. (1992). Evaluating the Impact of School Finance Reform on the Provision of Public Education:
The California Case. National Tax Journal, 45(December): 405-419.

Downes, T.A., and Figlio, D.N. (1999). What Are the Effects of School Finance Reforms? Estimates of the Impact 
of Equalization on Students and on Affected Communities. Unpublished manuscript. Tufts University, 
Medford, MA. 

Downes, T.A., and Figlio, D.N. (2000). School Finance Reforms, Tax Limits, and Student Performance: Do 
Reforms Level-Up or Dumb Down? Unpublished manuscript. Tufts University, Medford, MA. 

Downes, T.A., and Pogue, T.F. (1994). Accounting for Fiscal Capacity and Need in the Design of School Aid
Formulas. In J.E. Anderson (Ed.), Fiscal Equalization for State and Local Government Finance. New York: 
Praeger. 

Downes, T.A., and Shah, M. (1995, June). The Effect of School Finance Reform on the Level and Growth of Per 
Pupil Expenditures (Tufts University Working Paper No. 95-94). Medford, MA: Tufts University. 

Duncombe, W., and Johnston, J.M. (2002). Is Something Better Than Nothing? An Assessment of School
Finance Reform in Kansas (Working Paper). Syracuse, NY: Center for Policy Research, Maxwell School of 
Citizenship and Public Affairs, Syracuse University. 

Evans, W.N., Murray, S., and Schwab, R.M. (1997). Schoolhouses, Courthouses, and Statehouses after 
Serrano. Journal of Policy Analysis and Management, 16(Winter): 10-31.

Evans, W.N., Murray, S., and Schwab, R.M. (1999). The Impact of Court-Mandated School Finance
Reform. In H.F. Ladd, R. Chalk, and J.S. Hansen (Eds.), Equity and Adequacy in Education Finance: Issues
and Perspectives. Washington, DC: The National Academies Press.

Fernandez, R., and Rogerson, R. (1997). Education Finance Reform: A Dynamic Perspective. Journal of 
Policy Analysis and Management, 16(Winter): 67-84.

Fernandez, R., and Rogerson, R. (1998). Public Education and Income Distribution: A Dynamic Qualitative
Evaluation of Education Finance Reform. American Economic Review, 88(September): 813-833.

Flanagan, A., and Murray, S. (2002). A Decade of Reform: The Impact of School Reform in Kentucky. Unpub- 
lished manuscript, RAND, Washington, DC. 

Heaps, R., and Woolf, A. (2000, September). Evaluation of Act 60: Vermont’s Education Financing Law. 
Unpublished manuscript. Northern Economic Consulting, Inc., Burlington, VT. 






Hoxby, C.M. (2001). All School Finance Equalizations Are Not Created Equal: Marginal Tax Rates Matter. 
The Quarterly Journal of Economics, 116(November): 1189-1231.

Husted, T.A., and Kenny, L.W. (2000). Evidence on the Impact of State Government on Primary and
Secondary Education and the Equity-Efficiency Trade-Off. Journal of Law and Economics, 43(1): 285-308.

Jimerson, L. (2001, February). A Reasonably Equal Share: Educational Equity in Vermont: A Status Report — 
Year 2000—2001 (Report of the Rural School and Community Trust Policy Program). Washington, DC: 
Rural School and Community Trust. 

Manwaring, R.L., and Sheffrin, S.M. (1997). Litigation, School Finance Reform, and Aggregate Educational 
Spending. International Tax and Public Finance, 4(2): 107-27.

Mathis, W.J. (1995). Vermont. In S.D. Gold, D.M. Smith, and S.B. Lawton (Eds.), Public School Finance
Programs of the United States and Canada, 1993-94. Albany, NY: The Nelson Rockefeller Institute of
Government, State University of New York. 

Mathis, W.J. (1998, July). Act 60 and Proposition 13. Montpelier, VT: Concerned Vermonters for Equal
Educational Opportunity. Available at http://www.act60works.org/opedl.html 

Mathis, W.J. (2001). Vermont. In C.C. Sielke, J. Dayton, C.T. Holmes, and A.L. Jefferson (Comps.), Public
School Finance Programs of the U.S. and Canada: 1998-99 (NCES 2001-309). U.S. Department of
Education. Washington, DC: National Center for Education Statistics. 

McClaughry, J. (1997, December). Educational Financing Lessons From California (Ethan Allen Institute
Commentary). Concord, VT: Ethan Allen Institute. Available at http://www.ethanallen.org/commentaries/1997/educatingfmancial.html

McClaughry, J. (2001, January). Schoolchildren First: Replacing Act 60 (Ethan Allen Institute Policy Brief
007). Concord, VT: Ethan Allen Institute. Available at http://www.ethanallen.org.html 

Murray, S.E., Evans, W.N., and Schwab, R.M. (1998). Education Finance Reform and the Distribution of 
Education Resources. American Economic Review, 88(4): 789-812.

Nechyba, T.J. (1996, June). Public School Finance in a General Equilibrium Tiebout World: Equalization
Programs, Peer Effects and Private School Vouchers (NBER Working Paper No. w5642). Cambridge, MA: 
National Bureau of Economic Research. 

Nechyba, T.J. (2000). Mobility, Targeting and Private School Vouchers. American Economic Review, 90(1):
130-46.

Silva, F., and Sonstelie, J. (1995). Did Serrano Cause a Decline in School Spending? National Tax Journal,
48(2): 199-215.






Shopping for Evidence Against 
School Accountability 

Margaret E. Raymond 
Eric A. Hanushek 
Stanford University 



About the Authors 

Margaret E. Raymond is director of CREDO at the 
Hoover Institution of Stanford University. CREDO 
provides impartial evaluation of educational programs 
and policies. She has published independent evalua- 
tions of Teach for America, charter schools, and 
California’s accountability system. She can be con- 
tacted at macke@stanford.edu. 



Eric A. Hanushek is the Paul and Jean Hanna Senior 
Fellow at the Hoover Institution of Stanford Univer- 
sity, chair of the executive committee for the Texas 
Schools Project, and a research fellow of the National 
Bureau of Economic Research. He has written exten- 
sively about the economics and finance of schools. He 
can be contacted at hanushek@stanford.edu. 



The papers in this publication were requested by the National Center for Education Statistics, U.S. Depart- 
ment of Education. They are intended to promote the exchange of ideas among researchers and policymakers. 
The views are those of the authors, and no official endorsement or support by the U.S. Department of Educa- 
tion is intended or should be inferred. This publication is in the public domain. Authorization to reproduce it 
in whole or in part is granted. While permission to reprint this publication is not necessary, please credit the 
National Center for Education Statistics and the corresponding authors. 









Accountability has been a central feature of educational 
policy in a number of states since the 1990s. In part 
because of the perceived success of accountability in 
the states where it was initially tried, federal law in- 
troduced mandatory reporting and accountability 
through the No Child Left Behind Act of 2001. Yet 
not everybody is happy with school accountability. Its 
opponents continue to aggressively search for evidence 
that testing and accountability do not work — or, bet- 
ter, that they are actually harmful. The hope of the 
anti-accountability forces is that they can stop testing 
before it is fully in place and before rollbacks would 
be impossible. 

The window of opportunity to cripple or stop testing 
is narrowing over time, so it is not surprising that hasty 
reports based on biased research should appear. Nor is 
it surprising that these reports are given attention by 
parties who are unschooled in the requirements of good 
research. Perhaps we could disregard these events if 
the policies themselves were unimportant or if public 
exposure to poor quality studies had no effect on the 
ultimate decisions about them. But that is not the 
case. Since testing and accountability represent the cor- 
nerstone of current school reform efforts, it is essential 
that we apply rigorous standards of evidence and of 
scientific method to the analysis of accountability
policy. The impact of testing and accountability is
perhaps the most important issue facing school 
policymakers today. Even though accountability, by 
itself, does not say anything about how to organize an 
effective school, measures of school performance pro- 
vide a standardized construction of information needed 
to forge through the bewildering array of “answers” to 
the question of how to improve our schools. While it 
is certainly reasonable to question the effectiveness of 
particular accountability systems and the policy of 
accountability in general, little thought has been given 
to the scientific standards of evidence that ought to 
apply to research and evaluation aimed at informing 
or influencing the policy process in this important area. 

Assessing the impact of state accountability is clearly 
difficult. Policies have been in place for a limited 
amount of time. All states but one have adopted a sys- 
tem in one form or another. Not all accountability 
systems are the same. When put in place, they apply 
to all schools within entire states, limiting relevant 
variation to differences across states. This means that 
we have lost forever the chance to test whether account- 
ability systems are superior to what states had before. 
Finally, accountability systems are just one of many 
ways in which states tend to differ. These factors do 
not imply that gathering evidence about the effects of 
accountability is impossible. They simply reinforce the
need to apply strict scientific methods to ensure that 
uncertainty is reduced as far as possible. 

Bad news about accountability gets an undue amount 
of media coverage. First, the anti-accountability forces
trumpet any possible scrap of data that might be por- 
trayed as generalizable evidence against routine test- 
ing and accountability. Second, researchers reinforce 
this by their popular search for unintended conse- 
quences of government actions. Finally, the press, look- 
ing for both controversy and balance in reporting, tends 
to cite any study — no matter what its scientific qual- 
ity — to show the evenhandedness of its reporting. 

What do we know to date? The ex- 
isting evidence on state accountabil- 
ity systems indicates that their use 
leads to improvement of student 
achievement. States that introduced 
accountability systems during the 
1990s tended to show more rapid 
achievement gains when compared 
to states that did not introduce such 
measures. Along with general im- 
provement, there also appear to be 
instances of unintended conse- 
quences — such as increased special 
education placement or outright 
cheating — at the time of introduc- 
tion, but there is no evidence that 
this continues over time. Looking 
across states, we also know that attaching stakes to 
performance on tests yields better performance. 
Though still preliminary, these findings rest on rig- 
orous analytic techniques, providing policymakers the 
most reliable evidence yet available. 

What do we not know to date? Plenty. We do not know 
which general designs of accountability systems work 
best, or even the best underlying content standards 
for achievement. Nor do we know the optimal way to 
attach rewards and punishments to performance. Who 
should be judged by what scores? These are things 
that will take time to discover, but there is no way to
get from here to there without a systematic approach
to future policy enhancements and continued rigor- 
ous evaluation of their effects. 

Evidence About Existing 
Accountability Systems 

Over the past decade, states have devised diverse ac- 
countability systems that differ by choice of test, grades 
monitored, subjects tested, and performance require- 
ments. Direct comparison of state against state based 
on state accountability system information is therefore 
problematic; a common but independent standard of 
comparison is needed. One source of information on 
performance, however, offers some possibility for analy- 
sis. The National Assessment of Edu- 
cational Progress (NAEP), the 
“Nation’s Report Card,” provided per- 
formance information for states dur- 
ing the 1990s. While not designed 
as a national test, these examinations 
provide a highly respected and con- 
sistent tracking of student perfor- 
mance across grades and time. Since 
scores are not reported for individu- 
als or schools, there is no incentive to 
prep for them or to cheat on them. 
We have used these performance mea- 
sures to assess the impacts of state ac- 
countability systems. 

Education is the responsibility of state 
governments, and states have gone in a variety of direc- 
tions in the regulation, funding, and operation of their 
schools. As a result, it is difficult to assess the impacts of 
individual policies without dealing with the potential 
impacts of coincidental policy differences.¹

The basic analysis focuses on growth of student achievement
across grades.² If the impacts of stable state policies
enhance or detract from the educational process
in a consistent manner across grades, concentrating 
on achievement growth implicitly allows for stable state 
policy influences and permits analysis of the introduc- 
tion of new state accountability policies. 






¹ Hanushek, Rivkin, and Taylor (1996) discuss the relationship between model specification and the use of aggregate state data. The
development here builds on the prior estimation in Hanushek and Somers (2001) and the details of the model specification and
estimation can be found there.

² Here we summarize the results of the analysis in Hanushek and Raymond (2003a, 2003b).
The NAEP testing measured math performance
of fourth-graders in 1992 and 1996 and of eighth- 
graders 4 years after each of these assessments. While 
the students are not matched, following the same co- 
hort acts to eliminate a variety of potentially confound- 
ing achievement influences. We also supplement the 
raw NAEP data by considering differences in parental 
education levels and in school spending across these 
states. Our analysis of achievement relies on growth in 
achievement in reading and math between fourth- and 
eighth-graders over the relevant 4-year period, e.g., 
growth in achievement from fourth grade in 1996 to 
eighth grade in 2000. Our sample is all states for which 
the relevant NAEP scores are available. 

The potential effects of accountability systems clearly 
depend on when and where these systems were intro- 
duced. Table 1 describes the time path of introduction 
of accountability systems across states by reference to 
the length of time that accountability systems have been 
operating in different states. For these purposes, we de- 
fine accountability systems as those that relate student 
test information to schools and either simply report 
scores or provide rewards and sanctions.³ By looking at
accountability systems in 1996, it is clear that much of 
the movement to accountability is very recent. In 1996, 
just 10 states had already introduced active account- 
ability systems, while by 2000 only 13 states had yet 
to introduce active systems.⁴



We rely on statistical analyses of differences in NAEP 
growth across states to infer the impact of introducing 
state accountability. Because a differing set of about 
40 states participated in the NAEP testing in each of 
the years, the amount of evidence is limited. None- 
theless, state accountability systems uniformly have a 
significant impact on growth in NAEP scores, while 
other potential influences — spending and parental edu- 
cation levels — do not. 

Figure 1 summarizes the impact of existing state sys- 
tems by tracking the gains in mathematics between 
1996 and 2000 for the typical student who progresses 
from fourth to eighth grade under different systems. 
These expected gains, calculated from regression analy- 
ses of scores on NAEP, illustrate the impact of testing 
and reporting across states.⁵ States were classified according
to the type of accountability system they had
in place at the time of the NAEP test. (A state’s classi- 
fication could change between the two test years if its 
accountability system had been newly adopted or 
changed in the interim.) The typical student in a state 
without an accountability system of any form would 
see a 0.7 percent increase in the proficiency scores be- 
tween fourth and eighth grades. States with “report 
card” systems display test performance and other fac- 
tors but do not attach sanctions and rewards to the 
information. In many ways, these systems simply serve 
a public disclosure function. Just this reporting moves
the expected gain to 1.2 percent. Finally, states that
provide explicit scores for schools and that attach sanctions
and rewards (what we call “consequential accountability”
systems) obtained a 1.6 percent increase in
mathematics proficiency scores. In short, testing and
accountability as practiced have led to significant gains
in student performance over that expected without
formal systems.

³ We do not include states that place rewards or sanctions (“high-stakes”) just on students, for example through use only of a required
graduation exam. The school accountability systems are most relevant for No Child Left Behind, but this restriction introduces some
differences between our analysis and the analysis of Amrein and Berliner (2002) that is analyzed below.

⁴ In all analyses, the universe includes 50 states plus the District of Columbia. Nonetheless, not all states participate in the NAEP exams
each year, and the samples fall to around 35 in each year.

⁵ The details of these estimates can be found in Hanushek and Raymond (2003a). The results pool data on NAEP mathematics gains over
both the 1992-96 and 1996-2000 periods.

Table 1. Distribution of states with consequential accountability or reporting system: 1996 and
2000

                     Number of states
                     1996      2000
No system              41        13
System in place        10        38

NOTE: Distribution includes Washington, DC.

SOURCE: Fletcher and Raymond (2002).

Figure 1. Estimated effects of state accountability systems on gains between fourth grade and
eighth grade for National Assessment of Educational Progress (NAEP) mathematics
scores: 1996-2000

[Figure not reproduced. Axis label: Percent mathematics gains from fourth to eighth grade.]

SOURCE: Author calculations from Hanushek and Raymond (2003a, 2003b).
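
The three expected gains quoted in the text (0.7, 1.2, and 1.6 percent) can be put side by side with a few lines of arithmetic. The sketch below is only an illustration: the dictionary keys and variable names are made up for this example, and the figures are the rounded values reported in the text rather than the underlying regression estimates.

```python
# Expected fourth-to-eighth-grade gains (percent) as quoted in the text.
# The labels are illustrative names chosen for this sketch.
expected_gain = {
    "no system": 0.7,
    "report card": 1.2,
    "consequential": 1.6,
}

baseline = expected_gain["no system"]

# Gain over and above a state with no accountability system, in
# percentage points. Rounded to one decimal to match the precision
# of the published figures.
advantage = {
    system: round(gain - baseline, 1)
    for system, gain in expected_gain.items()
    if system != "no system"
}

for system, extra in advantage.items():
    print(f"{system}: +{extra} percentage points over no system")
```

Because the inputs are already rounded, the differences are approximate.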

A complementary analysis by Carnoy and Loeb (2002), 
while not considering the timing of the introduction 
of accountability, includes a rating of the stringency of 
the accountability system that is finer grained than the 
two categories we employ. It also adds information about 
student stakes and accountability. Carnoy and Loeb’s 
findings reinforce the present analysis that account- 
ability increases NAEP performance. A variety of other 
systematic studies of accountability systems within 
states and local school districts have also investigated 
what happens when accountability systems are intro- 
duced. While we describe the evidence in detail else- 
where (Hanushek and Raymond 2003a, 2003b), it 
generally supports two conclusions. First, improve- 
ments in available measures of student performance 
occur after the introduction of an accountability system.
Second, other short-run changes, such as increases
in test exclusions or explicit cheating, are observed.
In other words, some unintended consequences often 
tend to accompany the introduction of accountability, 
although as of now there is little evidence suggesting 
that these influences continue over time. 

We ourselves have looked explicitly at state differences 
in special education placement rates and whether they 
are related to accountability systems. For the period 
1995-2000, a time of large change in the use of ac- 
countability systems, we see no evidence that increased 
special education placement is a reaction to account- 
ability systems (Hanushek and Raymond 2003a, 
2003b). This analysis does, however, show why some 
could mistakenly conclude that accountability has an 
impact: overall special education placement increases 
within states over this time period, so the introduc- 
tion of accountability systems in the middle of the 
period can look like it influences placement. 

Carnoy and Loeb (2002) also investigate the impacts 
of accountability on grade retention and graduation. 
They demonstrate that there is no discernible nega- 
tive effect on retention and graduation. 
The set of scientific studies of accountability has been
presented at a range of scientific conferences, and many 
have undergone peer review for journal publication. 
In fact, because of the importance of the topic, the 
Kennedy School at Harvard held an entire conference 
on accountability in June 2002, and Brookings pub- 
lished the papers in 2003 (Peterson and West 2003). 

The Allure of Counter-Evidence 

In late 2002, Amrein and Berliner, hereafter AB, produced
a study on the impact of high-stakes accountability
systems that garnered considerable attention
(AB 2002).⁶ Their analysis of 28 states considers the
effects on state-specific NAEP scores and college entrance
examination measures in the period following
adoption of a high-stakes accountability program.⁷ Their
analysis concludes “there is inadequate evidence to
support the proposition that high-stakes tests . . . increase
student achievement” (AB 2002, p. 57).

The press release that describes the report goes further:
“The Berliner-Amrein analyses suggest that, as indicated
by student performance on independent measures of
achievement, high-stakes tests may inhibit the academic
achievement of students, not foster their academic
growth.”



A closer look at the research, however, shows it to be 
fatally flawed both in design and in execution, render- 
ing the conclusions irrelevant. We consider only the 
effects of accountability systems on NAEP scores in 
the 26 states that AB record as having adopted grade 
school high-stakes tests.⁸

It is difficult to ascertain from the main text or the 
technical appendixes exactly what procedures and defi- 
nitions AB employed. AB’s methodology seems best 
described as a “pseudo-trend analysis” with, at times, 
absent baseline data.⁹ Given the fact that state-level
NAEP data on the math and reading tests are avail- 
able only for at most four data points, AB essentially 
were confined to performing case studies of individual 
states.¹⁰ They purport to examine the
change in scores before and after the 
accountability system was adopted 
in each state — thus using each state 
as a control for itself. To give some 
independent context for these dif- 
ferences, it appears they also gener- 
ally compared the state change to the 
change that was observed for the 
nation as a whole. States were coded 
as increasing on a particular test if 
the gains in average test results ex- 
ceeded the national average change, 
or coded as decreasing in the opposite
case. Finally, all scores were then considered in relation to the relative
change in exclusion rates between the national average
and the individual states. Where states’ exclusion
rates exceeded the national average, AB hypothesize
that scores should rise because of these exclusions.
Thus, whenever exclusion rates moved in the same
direction as the observed NAEP test results, they considered
the score change contaminated (regardless of
the magnitudes involved) and eliminated the state
from further consideration as “Unclear.”¹¹ Finally,
among states that remained (between 8 and 12 depending
on the particular NAEP test), they examined
the proportion of states with increases versus
those with decreases relative to the national average.
Based on this approach, they concluded that “67 percent
of the states posted overall decreases in NAEP
math grade 4 ... 63 percent of the states posted
increases in NAEP math grade 8 . . . and 50 percent
of the states posted increases in NAEP reading grade
4 as compared to the nation after high-stakes tests
were implemented” (AB 2002, p. 56).

⁶ This study is described as having been completed for the Great Lakes Center for Education Research and Practice, a Michigan-based
think tank. That organization, which is solely financed by National Education Association State Education Affiliate Associations from
Illinois, Indiana, Michigan, Minnesota, Ohio, and Wisconsin, in turn describes a key element of its mission as being to “connect with
like-minded organizations to partner on key education initiatives.”

⁷ We have not assessed their identification and timing of high-stakes testing, which apparently can relate both to school stakes and
individual student stakes.

⁸ Georgia and Minnesota only adopted high school exit requirements, the subject of AB’s technical appendix. There are also strong reasons
to question their analysis of high school level performance, given the looser degree of correspondence between high school exit requirements
and college entrance test results, but that discussion necessarily gets into other issues and only distracts from the key linkages to state
accountability that we emphasize here.

⁹ For example, they most frequently say in the write-ups for individual states things like “After stakes were attached to tests in Maryland,
grade 4 math achievement decreased” (p. 28). But, since fourth-grade NAEP scores in Maryland, like those in all of their high-stakes tests
except Delaware in 1992-96, increased in every test year, we infer that they really meant to describe a comparison with the average
national changes.

¹⁰ Note that reading and math were tested in different years during the 1990s and that many states did not participate in all four waves of
NAEP testing.
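
The coding procedure attributed to AB above can be made concrete with a small sketch. This is one reading of the description in the text, not AB's actual procedure: the class, the function names, the handling of ties, and every number below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StateChange:
    name: str
    score_change: float      # change in the state's average NAEP score
    exclusion_change: float  # change in the state's NAEP exclusion rate


def code_state(state, national_score_change, national_exclusion_change):
    """Code one state as 'Increase', 'Decrease', or 'Unclear'."""
    rel_score = state.score_change - national_score_change
    rel_excl = state.exclusion_change - national_exclusion_change
    # Scores and exclusions moved in the same direction relative to the
    # nation: the state is dropped, whatever the magnitudes involved.
    if rel_score * rel_excl > 0:
        return "Unclear"
    # Assumption of this sketch: a state that exactly matches the
    # national score change is coded as 'Decrease'.
    return "Increase" if rel_score > 0 else "Decrease"


def share_increasing(codes):
    """Share of retained (non-'Unclear') states coded as 'Increase'."""
    kept = [c for c in codes.values() if c != "Unclear"]
    return sum(c == "Increase" for c in kept) / len(kept)


# Hypothetical national changes and state data.
NATIONAL_SCORE_CHANGE = 3.0
NATIONAL_EXCLUSION_CHANGE = 1.0
states = [
    StateChange("A", 5.0, 2.0),
    StateChange("B", 5.0, 0.5),
    StateChange("C", 1.0, 2.0),
    StateChange("D", 1.0, 0.0),
    StateChange("E", 2.0, 3.0),
]

codes = {
    s.name: code_state(s, NATIONAL_SCORE_CHANGE, NATIONAL_EXCLUSION_CHANGE)
    for s in states
}
share = share_increasing(codes)
print(codes)
print(f"Share of retained states coded 'Increase': {share:.2f}")
```

In this made-up example two of the five states are discarded as "Unclear", and the final share rests on the three that remain. The rule also never compares adopting states with states that did not adopt high-stakes testing.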

AB violate the first principle of social science research — 
the need to control for the condition of interest. They 
used the 26 states with high-stakes accountability sys- 
tems and limited their analysis to those states alone. 



The natural comparison group, however, is the states 
that had not adopted accountability systems. Such a 
comparison, which offers some insights into the impact 
of high-stakes testing as opposed simply to variations 
among states with high-stakes systems, yields starkly 
different results than their suggested interpretations.¹²
In fact, their results are completely reversed, putting 
the evidence in line with that previously discussed. 

Table 2 simply compares fourth- and eighth-grade 
NAEP test score gains for the states AB identify as 
implementing high-stakes testing with those that were 
not so identified.¹³ For either the entire 1992-2000
period or the later period of 1996-2000, the average 
gain in math for high-stakes states significantly exceeds 
that for the remaining tested states. The difference in 
performance is always statistically significant at con- 
ventional levels (a nuance that AB never even mention 
in their 236 pages of analysis).¹⁴
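
To illustrate what a significance test adds to a comparison of two groups of states, the sketch below runs an exact permutation test on the difference in mean gains. This is a swapped-in technique for illustration only: the text does not say which test produced the reported significance levels, and the state gains used here are invented, not NAEP data.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference in means.

    Returns the observed difference (mean of group_a minus mean of
    group_b) and the share of all possible regroupings whose absolute
    difference is at least as large as the observed one.
    """
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    extreme = 0
    n_splits = 0
    # Enumerate every way of labelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in idx)
        diff = a_sum / n_a - (total - a_sum) / n_b
        n_splits += 1
        # Small tolerance guards against floating-point noise.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_splits


# Hypothetical score gains for two small groups of states.
high_stakes_gains = [9.0, 7.5, 10.0, 8.0, 11.0, 9.5]
other_gains = [4.0, 3.0, 5.5, 2.5, 4.5]

observed, p_value = permutation_p_value(high_stakes_gains, other_gains)
print(f"Observed difference in mean gains: {observed:.2f}")
print(f"Permutation p-value: {p_value:.4f}")
```

A small p-value means that a gap this large would rarely arise if the group labels were assigned at random, which is the protection against chance differences that the text refers to.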

AB highlight changes in exclusion rates from test tak- 
ing as a possible influence on state test scores, and 
differences in exclusions between high-stakes states and 
others could influence the performance differentials 
shown. Indeed, many people have suggested that a 



¹¹ In reality, they do not even appear consistent on this, and they violate their own coding scheme more than once. Take, for example, West
Virginia, where they state: “Overall NAEP math grade 4 scores increased at the same time the percentage of students exempted from the
NAEP increased. Overall, after stakes were attached to tests in West Virginia, grade 4 math achievement decreased.” [their emphasis]

¹² While we reproduce their analysis with a larger set of observations, this should not be construed as an endorsement of the analytical
approach. More rigorous tools yield more reliable results. We follow their lead in order to show how their answers would have differed
had they applied their own approach correctly.

¹³ Note that for each of the comparisons, data are available for 34 to 36 states, with between 18 and 20 being in the AB high-stakes sample.
The limited number reflects the varying participation of states in the NAEP testing.

¹⁴ Statistical testing is done to guard against changes in test performance that simply reflect random score differences that do not represent
true differences in student performance. Such random differences could, for example, reflect chance differences in the tested population,
small changes in question wording, or events specific to the testing in a given year and given state. In their subsequent defense of their
analysis, AB assert that such testing is unnecessary and may even be inappropriate, but this assertion is obviously incorrect (AB 2003).



Table 2. Average gains in National Assessment of Educational Progress (NAEP) mathematics
scores, by Amrein-Berliner (AB) high-stakes states versus other states: 1992-2000

                            Change in fourth-grade        Change in eighth-grade
                            NAEP mathematics scores       NAEP mathematics scores
                            1992-2000     1996-2000       1992-2000     1996-2000

AB high-stakes states          9.2           4.2             8.8           4.5
Other states                   3.8           2.3             4.0           1.7
High-stakes advantage          5.3           1.9             4.8           2.8
Statistical significance     p<.001        p<.04           p<.003        p<.02

SOURCE: Author calculations.
consequence of the introduction of high-stakes testing
is an increase in test exclusions. 

The hypothesized effect of accountability on test ex- 
clusions does not appear important in explaining the 
aggregate accountability results. For the nation as a 
whole, exclusion rates on the eighth-grade NAEP math 
tests were the same in 2000 as in 1992, while the 
fourth-grade exclusions over that time period fell 
slightly. Table 3 shows evidence for the NAEP exclu- 
sion rates for the 1992-2000 period for the high-stakes 
and non-high-stakes states. While the change in ex- 
clusion rates over the 1990s is slightly higher for high- 
stakes states in the testing of eighth-grade mathematics, 
it is slightly lower for fourth-grade mathematics when 
compared to other states. But neither difference in av- 
erage exclusion by accountability status is statistically 
significant. 

We also standardize the achievement gains for observed 
changes in exclusions through regression analysis.
Interestingly, while changes in exclusion rates are
significantly related to changes in eighth-grade scores,
they are not significantly related to changes in fourth-
grade scores, underscoring the need to analyze
central maintained hypotheses. Table 4 compares such
adjusted estimates of the achievement gain advantage 
of high-stakes tests to the previously unadjusted dif- 
ferences. Again, there are small effects on the esti- 
mated impact of high-stakes testing on gains, but in 
all cases states that introduce high-stakes testing out- 
perform those that do not by a statistically signifi- 
cant margin. In sum, the previous estimates are not 
driven by test exclusions. 
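The logic of the adjustment can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' specification, which the text does not give: it assumes score gains are regressed on a high-stakes indicator plus the change in exclusion rates, and it uses hypothetical, noise-free data built from an assumed linear model. The helper functions solve and ols are illustrative names.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# HYPOTHETICAL data: (high-stakes indicator, change in exclusion rate) per state.
states = [(1, 4.0), (1, 3.0), (1, 5.0), (1, 2.0),
          (0, 1.0), (0, 2.0), (0, 3.0), (0, 0.0)]
# Gains built from an ASSUMED model: 3 + 5*high_stakes + 0.5*exclusion_change.
gains = [3.0 + 5.0 * hs + 0.5 * ex for hs, ex in states]

hs_gains = [g for (hs, _), g in zip(states, gains) if hs == 1]
other_gains = [g for (hs, _), g in zip(states, gains) if hs == 0]
unadjusted = sum(hs_gains) / len(hs_gains) - sum(other_gains) / len(other_gains)

# Regression of gains on a constant, the indicator, and the exclusion change.
design = [[1.0, float(hs), ex] for hs, ex in states]
intercept, adjusted, exclusion_slope = ols(design, gains)

print(f"unadjusted advantage: {unadjusted:.2f}")
print(f"adjusted advantage:   {adjusted:.2f}")
```

In this constructed example the high-stakes group also has larger increases in exclusions, so the raw difference in means overstates the advantage, and the coefficient on the indicator recovers the assumed value once exclusion changes are held constant.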

AB’s choice of the pseudo-trend design is even more 
mysterious when one considers that it could not be 
applied squarely to their sample. In eight states — 
Colorado, Indiana, Louisiana, New Jersey, New 
Mexico, Oklahoma, Tennessee, and West Virginia — 
high-stakes testing was identified by AB as having 
been adopted prior to 1990 or in 2000. Because these 
adoptions fall outside of the relevant testing period, 
any pre/post comparison based on NAEP data is im- 
possible. Thus, we refer to their design as “pseudo- 
trend” because they frequently lack data before or 



Table 3. Changes in NAEP mathematics exclusion rates, by Amrein-Berliner (AB) high-stakes
states versus other states: 1992-2000

                            Change in fourth-grade          Change in eighth-grade
                            mathematics exclusion rates     mathematics exclusion rates
                            1992-2000     1996-2000         1992-2000     1996-2000

AB high-stakes states          3.8           1.3               3.4           2.3
Other states                   4.1           2.0               2.6           1.9
High-stakes differential      -0.3          -0.7               0.8           0.4
Statistical significance     p<.76         p<.44             p<.40         p<.64

SOURCE: Author calculations.



Table 4. Adjusted average gains in NAEP mathematics scores, by Amrein-Berliner (AB) high-
stakes states versus other states: 1992-2000

                                  Change in fourth-grade        Change in eighth-grade
                                  NAEP mathematics scores       NAEP mathematics scores
High-stakes advantage             1992-2000     1996-2000       1992-2000     1996-2000

Unadjusted for test exclusions       5.3           1.9             4.8           2.8
  Statistical significance         p<.001        p<.04           p<.003        p<.02
Adjusted for change in test
exclusions                           5.2           2.3             3.7           2.5
  Statistical significance         p<.001        p<.02           p<.02         p<.02

NOTE: Adjusted average gains come from regression of NAEP score changes on exclusion rate changes.
SOURCE: Author calculations.
after the treatment of interest, and they often have
just two or three test scores that are not even aligned 
with the treatment. For some states, they observe only 
a single test score change, obviously making any pre/ 
post comparison unreliable. 

The use of national average changes in NAEP scores as 
a reference point further confounds the study. Any ef- 
fect of accountability systems is already captured in 
the national score change. By 1996, only 10 states 
had an accountability system in place, so the effect 
might not excessively affect the average. But by 2000, 
a majority of states were on board, so their impacts 
affected the national average change to a much greater 
degree. Late-adopting states are effectively being com- 
pared to other high-stakes states, making it difficult 
to show relative gains and completely rendering moot 
the interpretation that any differences reflect the high- 
stakes treatment. To take a purely hypothetical ex- 
ample, assume that 6 of the high-stakes states gained 
20 percent, while the other 20 gained 2 percent each 
and the no-accountability states made no gains whatso- 
ever — yielding a national average gain of 3 percent. 
AB’s approach would say that accountability had failed: 
just 6 states beat the national average, while 20 were 
below the average. In fact, ignoring any complications 
of exclusions, AB would report this as something like, 
“Just 23 percent of states posted gains on NAEP higher 
than the national average after high stakes were intro- 
duced.” The right approach, of course, would be to 



compare gains of high-stakes states to those of no- 
accountability states. 
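The arithmetic of this hypothetical can be checked directly. The sketch below assumes 50 states in total, so that 24 states have no accountability system; the text implies but does not state that count, and the variable names are illustrative.

```python
# Hypothetical from the text: 26 high-stakes states, of which 6 gain 20 percent
# and 20 gain 2 percent; states without accountability gain nothing.
# ASSUMPTION: 50 states in total, so 24 states have no accountability system.
high_stakes_gains = [20.0] * 6 + [2.0] * 20
no_accountability_gains = [0.0] * 24

all_gains = high_stakes_gains + no_accountability_gains
national_average = sum(all_gains) / len(all_gains)

# AB-style tally: share of high-stakes states that beat the national average.
beat_national = sum(g > national_average for g in high_stakes_gains)
ab_style_share = beat_national / len(high_stakes_gains)

# Comparison-group approach: mean gain of high-stakes states versus the rest.
mean_high = sum(high_stakes_gains) / len(high_stakes_gains)
mean_none = sum(no_accountability_gains) / len(no_accountability_gains)

print(f"national average gain: {national_average:.1f} percent")
print(f"AB-style share beating the nation: {ab_style_share:.0%}")
print(f"high-stakes mean {mean_high:.1f} vs. no-accountability mean {mean_none:.1f}")
```

Under these assumptions the national average is about 3 percent and only 6 of 26 high-stakes states exceed it, even though every high-stakes state outgains every state without accountability.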

A subtler but important issue arises when the timing 
of adoption of an accountability system was bracketed 
by NAEP tests. It is clear that AB did not use a consis- 
tent convention. In some cases, it appears that they 
used the NAEP results from the period immediately 
prior and immediately following adoption of account- 
ability, but in others, it appears that they used a dif- 
ferent time interval, in some cases starting after the 
accountability systems were adopted. The one consis- 
tent choice appears to be reliance on the least flatter- 
ing results (for high-stakes accountability). 

The implications of these nonscientific procedures are
best seen within the context of their finding of 
“harm.” Table 5 examines the set of states where AB 
concluded that fourth-grade NAEP math scores de- 
creased with the introduction of high-stakes testing. 
For the eight such identified states, we present ag- 
gregate information on testing and results. In three 
of the eight states (New Mexico, Oklahoma, and 
West Virginia), AB identify the introduction of high- 
stakes testing as falling outside the testing period 
(which did not begin until the 1990s). Moreover, 
no real trend data in math gains are available for 
Nevada and Oklahoma, where only a single period 
of test change is observed. During the 1992-96 pe- 
riod when Kentucky, Maryland, and Missouri intro- 



Table 5. Data on NAEP fourth-grade mathematics performance in states identified by Amrein-
Berliner (AB) as decreasing after the introduction of high-stakes tests: 1992-2000

States where AB declared     Introduction of high-stakes
decreases in NAEP scores     testing (AB date)              1992-1996    1996-2000    1992-2000

Kentucky                     1994                           4.9²         1.0          5.9³
Maryland                     1993                           3.4²         1.6          5.0³
Missouri                     1993                           2.5³         3.8²         6.3³
Nevada                       1998                           N/A          IN           N/A
New Mexico                   1989¹                          0.5          0.0          0.6
New York                     1999                           4.2²         3.9²         8.1²
Oklahoma                     1989¹                          N/A          N/A          4.7³
West Virginia                1989¹                          8.1²         1.5          9.6²

N/A — NAEP data unavailable for this time period.

¹ No NAEP tests at or before introduction of high-stakes testing.

² Change in NAEP scores exceeds the average change in NAEP both for all states and for states not adopting high-stakes testing.
³ Change in NAEP scores exceeds the average change for states not adopting high-stakes testing.

NOTE: Bold entries highlight evidence concerns discussed in text.

SOURCE: Author calculations.
duced high-stakes testing, two had math gains ex-
ceeding the average for all tested states, and one had
gains that just exceeded the average for states that
did not introduce high-stakes testing.¹⁵ Nevada,
which they record as introducing high-stakes testing 
in 1998, had gains during 1996-2000 that exceeded 
gains for non-high-stakes states. Over the entire pe- 
riod of 1992-2000, five of the six states for which 
data are available showed gains that at least exceeded 
the average for non-high-stakes states; New York and
West Virginia exceeded the average for all states. And 
this is the group of states that AB identify as being 
harmed by high-stakes testing! Not a single state pro- 
vides evidence of harm following the introduction of 
high-stakes testing. When read cor- 
rectly, if anything, the evidence 
points to generally higher perfor- 
mance in this group of states. 

The final blow to the credibility of 
AB’s results comes at the point of 
drawing inferences based on their 
analysis. Regardless of the choice of 
design, and ignoring the selective use 
of NAEP scores, we would still ex- 
pect AB to consider all the available 
data as they had constructed it to draw 
conclusions. But they did not. First, 
they eliminated all information about 
the magnitude of score changes, re- 
lying solely on whether scores increased or decreased. 
Second, they eliminated all the states that they judged 
to be “unclear,” which reduced the final tally to “im- 
proved vs. declined” instead of “improved vs. all states 
that adopted high-stakes.”¹⁶ For instance, they recorded
positive or negative results on the NAEP fourth-
grade math test for just 12 of the 26 states with 
high-stakes for grades K-8. AB found that fourth-grade 
math scores increased at a slower rate than the na- 
tional average in eight of the remaining states (those 
in table 5), faster in just four. Yet they write this up in 
a highly misleading fashion, claiming “67 percent of 



the states posted overall decreases in NAEP math grade 
4 performance as compared to the nation after high- 
stakes tests were implemented.” Actually, AB witnessed 
gains slower than the national average in just 8 of 26 
high-stakes states, or 31 percent. 
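The two percentages differ only in their denominators, as the following sketch shows. It uses the counts reported in the text; the function and variable names are illustrative.

```python
# Counts reported in the text for NAEP fourth-grade math.
slower_than_nation = 8    # states AB coded as "decreasing"
faster_than_nation = 4    # states AB coded as "increasing"
high_stakes_states = 26   # all states AB identified as high-stakes

def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)

# AB's denominator drops every state coded "unclear".
ab_figure = percent(slower_than_nation, slower_than_nation + faster_than_nation)
# Using all high-stakes states as the denominator instead.
full_sample_figure = percent(slower_than_nation, high_stakes_states)

print(f"AB's figure: {ab_figure} percent of 12 classified states")
print(f"Full-sample figure: {full_sample_figure} percent of 26 high-stakes states")
```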

Instead of concluding that the evidence does not sup- 
port the proposition that high-stakes accountability 
increases student achievement, it would be more
accurate to say that the evidence chosen by AB does not
support any inference at all.

Simply applying the underlying approach of AB to all 
of the data on NAEP achievement completely reverses 
their conclusions. High-stakes test
states on average perform significantly 
better than non-high-stakes states. 
For the reasons described previously, 
we still do not think that these simple 
comparisons are the best way to ana- 
lyze this question, but this analysis 
demonstrates that there is no differ- 
ence in the broad results from their 
crude approaches and the preferred 
analytical approaches we described 
previously. 

Not in a Vacuum 

The competing evidence on accountability program
performance raises a number of disturbing issues. One is
how unaware of or indifferent to quality differences in the
available evidence the media and many policymakers are.
The recent publicity
surrounding the AB essay highlights the vulnerability 
of key public policy initiatives to faulty evidence and 
badly informed reporting.¹⁷ Unlike in other policy
fields, reports in education seem to be taken at face
value or, worse, judged on the political orientations of the
authors, independent of the rigor of the analysis or
the suitability of the inferences that are drawn. While 
the most obvious example recently concerned the me- 






¹⁵ In terms of what periods were looked at by AB, it is difficult to come up with the rule for decisions on NAEP scores that includes both
Maryland and Missouri as “decreasing” states. 

¹⁶ As described above, the label “unclear” rests on their strong and untested hypothesis about the impact of exclusion rates on scores. Results
are unclear whenever the movement in exclusion rates is in the same direction as the movement in test scores, regardless of the magnitude
of either change.

¹⁷ Most notable among the publicity was a front page article in the New York Times (Winter [2002]). A link to this article currently appears
on the home page for the Great Lakes Center for Education Research and Practice: http://www.nytimes.com/2002/12/28/education/28EXAM.html.
Other newspapers and professional publications dutifully provided their own reporting of the AB results.
dia, the problem applies as well to many other actors
on the education landscape, including the legislative 
and executive leadership in many states. 

The issue of evidence quality is of prime importance 
when individuals serve a gatekeeper function for dis- 
seminating information to the general public. The me- 
dia acts as a filter to select issues that merit attention 
and then distills them into a few key points. 
Decisionmakers in education agencies serve a similar 
function when they attempt to reflect the effective- 
ness of the programs they have implemented. Indi- 
viduals in these positions are trusted, and expected, to 
go beyond the press release or a superficial examina- 
tion of a report or analysis by checking the facts, gaug- 
ing the credibility of the analytic 
approach, and vetting the results. We 
would certainly expect this if the 
topic under investigation were an al- 
legation of fraud or a new break- 
through in power generation. We 
need similar assurances in education. 

Perhaps this disregard is understand- 
able when one considers that the is- 
sue of the quality of evidence has only 
recently been raised among educators 
themselves. A recent National Re- 
search Council panel was convened to 
assess “scientific principles for educa- 
tion research” — a type of inquiry un- 
heard of in other research and policy fields (Shavelson 
and Towne 2002). Most schools of education offer 
courses in research methods as part of the curriculum, 
but a wide variety of techniques are taught, determined 
in no small part by the training and interests of the 
faculty teaching the courses and not limited to tradi- 
tional scientific inquiry. This is not to say that there 
are no appropriate uses for the variety of analytic skills 
that are taught. However, when significant public poli- 
cies involving many millions of dollars are on the line, 
as in the case of school and student accountability pro- 
grams, evidence must meet the highest scientific stan- 
dards. The analysis should be rigorous enough to 



consider and objectively to control for potentially com- 
peting explanatory factors, and the resulting evidence 
must be reliable. It should not be argued on political 
grounds masquerading as science.¹⁸

Federal policy has also taken a turn towards more 
stringent standards of evidence. The No Child Left 
Behind Act includes strong requirements for employ- 
ing educational programs based on solid scientific 
research. The creation of the Institute of Education 
Sciences is clearly directed at improving the quality 
of educational research. And, reacting to obvious qual- 
ity concerns about research that was being used to 
support policy, the U.S. Department of Education 
in 2002 funded the What Works Clearinghouse to 
establish strict scientific criteria for 
studies on program performance. In 
an effort to provide a “trusted source 
of scientific evidence,” the Clearing- 
house is designed to concentrate pri- 
marily on the quality of the research 
design and the rigor of the analytic 
techniques. (See http://w-w-c.org) 

Reporters should not be expected to 
be experts in statistical analysis any 
more than they are expected to be 
fully versed in biochemistry or in- 
vestment banking regulations. But 
it is not unreasonable to hold up a 
standard of reasonable scrutiny 
(bringing in expertise if needed as is done for medi- 
cal and scientific reporting). 

It is also not as if the issue is unimportant. Improving 
our educational performance would arguably lead to 
greater gains for society than any of the medical break- 
throughs of the past decade. For example, had there 
been true educational improvements following A Na- 
tion at Risk — putting U.S. student achievement on par, 
say, with that of students in better performing Euro- 
pean countries — it has been estimated that the GDP 
of the United States would have expanded sufficiently 
by 2002 to pay for all K-12 expenditures.¹⁹






¹⁸ See the debates about the effectiveness of accountability systems that entered into the 2000 presidential elections; Grissmer et al. (2000),
Klein et al. (2000), and Hanushek (2001). 

¹⁹ See Hanushek (2003a, 2003b) (http://www.educationnext.org/20032/index.html).
What We Do Not Know

We have suggestive evidence that accountability as 
implemented in the 1990s has been helpful. It is clear 
that, for one reason or another, performance has been 
better in accountability states than in nonaccountability 
states. We also have evidence that a number of unin- 
tended consequences have followed the introduction of 
accountability. We do not wish to suggest that we yet 
have anywhere near the amount of reliable evidence that 
is needed for developing fully satisfactory testing and 
accountability systems. But this is far different from 
completely retreating from assessing and reporting 
schooling outcomes. 

The findings leave us short of what we would like to 
know for policy purposes.²⁰ We do not understand how
best to design accountability systems that can be di- 
rectly linked to incentive systems. For example, the 
vast majority of state accountability systems report 
average performance for each school on various state 
tests. These are sometimes disaggregated for, say, race 
and ethnic groups. But, because these average scores 
are highly dependent on factors outside the control of 
schools — such as families and friends — it would not 
be appropriate to base school performance rewards on 
these unadjusted average scores. Doing that would 
encourage schools to concentrate more on who is tak- 



ing the test than on how their scores can be improved. 
Incentives are best attached to the value-added for 
which schools and teachers are responsible. 

Similarly, uncertainty remains about the best set of 
tests to measure accomplishment of the learning stan- 
dards of each state. Concerns about any possible nar- 
rowing of the curriculum or inappropriate changes in 
instructional practice are in large part concerns about 
the quality of the testing — because the entire intent 
of the accountability systems is that teachers do in 
fact teach to a well-designed set of tests that adequately 
reflect the range of material that students should know. 

Federal legislation in the No Child Left Behind Act 
represents an important starting point in a process to 
improve the performance of our schools. It established 
the necessity for regular annual testing of students and 
the public reporting of results. It also made some 
guesses about how to build incentives and require- 
ments into the system. The hope (and intent) of the 
anti-accountability forces is that regular testing and 
reporting be nipped in the bud. The challenge to ev- 
erybody is ensuring that we learn about accountabil- 
ity and adjust any current flaws before the 
anti-accountability forces succeed. Their success would 
surely leave our children and our nation worse off. 



²⁰ Issues of accountability system design and of incentive aspects of accountability systems are discussed in Hanushek and Raymond
(2003b). These analyses also assess the available evidence on various design issues.